Faculty of Electrical Engineering and Computing, University of Zagreb

Department for Electronic Systems and Information Processing

Laboratory for Systems and Signals


WeSeRV

Writing your own transforms

Specifics on writing a Netglub transform in general are described here. These instructions focus on writing transforms useful for academic research, such as those WeSeRV uses. To write a transform that extracts structured information from a specific website, it is usually enough to know basic HTTP and HTML, so that you understand how the site presents the information. Once you understand that, it is easy to use Python with the Requests and BeautifulSoup libraries to extract the information you need.
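As a rough illustration of the Requests plus BeautifulSoup approach described above, the sketch below separates downloading a page from parsing it. The function names, the `result-title` class, and the sample HTML are hypothetical and are not taken from WeSeRV or from any real website; inspect the target page to find the actual selectors.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url, params=None, timeout=10):
    """Download a page and return its HTML text; raise on HTTP errors."""
    response = requests.get(url, params=params, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_titles(html):
    """Return the link text found inside elements with class 'result-title'.

    The class name is a made-up placeholder for whatever markup
    the target site actually uses.
    """
    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True) for a in soup.select(".result-title a")]


# Hypothetical page fragment, used here instead of a live request.
SAMPLE_HTML = """
<ul>
  <li class="result-title"><a href="/item/1">Book One</a></li>
  <li class="result-title"><a href="/item/2"> Book Two </a></li>
  <li class="other"><a href="/about">About</a></li>
</ul>
"""

if __name__ == "__main__":
    for title in extract_titles(SAMPLE_HTML):
        print(title)
```

Keeping the parsing step in its own function means it can be tried on saved HTML without touching the network. For a live page, `extract_titles(fetch_html(url))` would combine the two steps.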

WeSeRV transforms usually have the following structure:


You can test a transform locally like this:

Note the empty double quotes "" that act as the delimiter between the input entity and the parameters.

A typical transform, with additional explanations, is PhraseToBookLOC, which uses an input phrase to extract information about books from the Library of Congress website (loc.gov). More transform examples can be found here.
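The empty-string delimiter mentioned above could be handled along the following lines. This is only a sketch: the helper name `split_transform_args` is hypothetical, and the assumption that everything before the `""` belongs to the input entity and everything after it to the parameters is inferred from the description here, not from the Netglub documentation.

```python
import sys


def split_transform_args(argv):
    """Split a list of arguments at the first empty string.

    Assumed convention: items before the "" describe the input entity,
    items after it are parameters. Without a "" there are no parameters.
    """
    if "" in argv:
        i = argv.index("")
        return argv[:i], argv[i + 1:]
    return list(argv), []


if __name__ == "__main__":
    entity_args, param_args = split_transform_args(sys.argv[1:])
    print("entity:", entity_args)
    print("parameters:", param_args)
```

The shell passes `""` to the script as a genuinely empty argument, which is why a plain comparison against the empty string is enough to find the delimiter.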