Operation

class ramose.Operation(op_complete_url, op_key, i, tp, sparql_http_method, addon)

Bases: object

This class is responsible for materialising an API operation to be run against a SPARQL endpoint.

It takes as input a full URL referring to a call to an operation (parameter ‘op_complete_url’), the particular shape representing an operation (parameter ‘op_key’), the definition (in JSON) of that operation (parameter ‘i’), the URL of the triplestore to contact (parameter ‘tp’), the HTTP method to use for the SPARQL request (parameter ‘sparql_http_method’, set to either ‘get’ or ‘post’), and the path of the Python file which defines additional functions for use in the operation (parameter ‘addon’).

static get_content_type(ct): It returns the MIME type of a given textual representation of a format, which is either ‘csv’ or ‘json’.
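As a rough illustration, the mapping could look like the sketch below. The function name `content_type_for` and the pass-through behaviour for unknown names are assumptions made for this example, not a copy of RAMOSE's code.

```python
def content_type_for(fmt):
    """Hypothetical helper: map a short format name to a MIME type."""
    mime_types = {"csv": "text/csv", "json": "application/json"}
    # Assumption: names other than 'csv' and 'json' are returned unchanged.
    return mime_types.get(fmt, fmt)


print(content_type_for("csv"))   # text/csv
print(content_type_for("json"))  # application/json
```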

static conv(s, query_string, c_type='text/csv'): This method takes a string representing a CSV document and converts it into the requested format, according to the content type specified as input.
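A minimal sketch of this kind of conversion, using only the standard library, is shown below. The function name `csv_to_format` is hypothetical, the ‘query_string’ parameter is left out, and only the CSV-to-JSON case is handled.

```python
import csv
import io
import json


def csv_to_format(csv_text, c_type="text/csv"):
    """Hypothetical stand-in for a converter keyed on content type."""
    if c_type == "application/json":
        # Read each CSV row as a dictionary keyed by the header names.
        rows = list(csv.DictReader(io.StringIO(csv_text)))
        return json.dumps(rows, ensure_ascii=False, indent=4)
    # Assumption: any other content type leaves the CSV untouched.
    return csv_text


sample = "doi,year\r\n10.1000/xyz,2018\r\n"
print(csv_to_format(sample, "application/json"))
```

Note that every value stays a string after the conversion, since CSV carries no type information.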

static pv(i, r=None)

This method returns the plain value of a particular item ‘i’ of the result returned by the SPARQL query.

In case ‘r’ is specified (i.e. a row containing a set of results), then ‘i’ must be the index of the item within that row.

static tv(i, r=None)

This method returns the typed value of a particular item ‘i’ of the result returned by the SPARQL query. The type associated with that value is the one configured in the specification file of the API (field ‘field_type’).

In case ‘r’ is specified (i.e. a row containing a set of results), then ‘i’ must be the index of the item within that row.
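The two calling conventions of ‘pv’ and ‘tv’ can be pictured with the sketch below. It assumes that each cell is a (typed value, plain value) pair, as the result table described for ‘postprocess’ suggests; the function names `plain_value` and `typed_value` are hypothetical.

```python
from datetime import datetime


def plain_value(i, r=None):
    """Plain part of a cell; 'i' is the cell itself, or its index in row 'r'."""
    cell = i if r is None else r[i]
    return cell[1]


def typed_value(i, r=None):
    """Typed part of a cell; 'i' is the cell itself, or its index in row 'r'."""
    cell = i if r is None else r[i]
    return cell[0]


# Assumption: a row is a tuple of (typed, plain) cells.
row = (("my_id_1", "my_id_1"), (datetime(2018, 3, 2), "2018-03-02"))

print(plain_value(row[1]))   # cell passed directly: 2018-03-02
print(typed_value(1, row))   # index within the row: 2018-03-02 00:00:00
```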

static do_overlap(r1, r2): This method returns a boolean indicating whether the two ranges (i.e. two pairs of integers) passed as input overlap.
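One common way to test two ranges for overlap is sketched below. It assumes inclusive ranges written as (start, end) with start <= end; whether RAMOSE treats the boundaries as inclusive is not stated here, and the name `ranges_overlap` is hypothetical.

```python
def ranges_overlap(r1, r2):
    """Return True if two inclusive (start, end) ranges share a point."""
    # Two ranges overlap when each one starts before the other ends.
    return r1[0] <= r2[1] and r2[0] <= r1[1]


print(ranges_overlap((1, 5), (4, 8)))  # True
print(ranges_overlap((1, 3), (4, 8)))  # False
```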

static get_item_in_dict(d_or_l, key_list, prev=None): This method takes as input a dictionary or a list of dictionaries and browses it, following the chain of keys indicated in ‘key_list’, until the specified value is found. It returns a list of all the values that matched the search.
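The traversal can be sketched as a short recursive function. This is an illustration under simplifying assumptions (the ‘prev’ accumulator is omitted and missing keys are silently skipped), and `collect_values` is a hypothetical name.

```python
def collect_values(d_or_l, key_list):
    """Follow 'key_list' through nested dicts/lists and gather the matches."""
    items = d_or_l if isinstance(d_or_l, list) else [d_or_l]
    results = []
    for item in items:
        if not isinstance(item, dict) or not key_list:
            continue
        key, rest = key_list[0], key_list[1:]
        if key not in item:
            continue
        if rest:
            # More keys to follow: descend into the nested structure.
            results.extend(collect_values(item[key], rest))
        else:
            results.append(item[key])
    return results


data = [
    {"author": {"name": "Doe, John"}},
    {"author": {"name": "Smith, John"}},
]
print(collect_values(data, ["author", "name"]))  # ['Doe, John', 'Smith, John']
```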

static add_item_in_dict(d_or_l, key_list, item, idx): This method takes as input a dictionary or a list of dictionaries, browses it, following the chain of keys indicated in ‘key_list’, until the specified value is found, and then substitutes that value with ‘item’. In case the final object retrieved is a list, it selects the object in position ‘idx’ before the substitution.
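A sketch of such an in-place substitution follows. The name `replace_value` is hypothetical, and the handling of intermediate lists (every element is visited) is an assumption of this example.

```python
def replace_value(d_or_l, key_list, item, idx):
    """Overwrite, in place, the value reached by following 'key_list'."""
    if not key_list:
        return
    if isinstance(d_or_l, list):
        if len(key_list) == 1:
            # The final object is a list: pick the element at position 'idx'.
            d_or_l[idx][key_list[0]] = item
        else:
            for element in d_or_l:
                replace_value(element, key_list, item, idx)
    elif key_list[0] in d_or_l:
        if len(key_list) == 1:
            d_or_l[key_list[0]] = item
        else:
            replace_value(d_or_l[key_list[0]], key_list[1:], item, idx)


record = {"author": {"name": "Doe, John"}}
replace_value(record, ["author", "name"], ["Doe", "John"], 0)
print(record)  # {'author': {'name': ['Doe', 'John']}}

rows = [{"name": "a"}, {"name": "b"}]
replace_value(rows, ["name"], "B", 1)
print(rows)  # [{'name': 'a'}, {'name': 'B'}]
```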

static structured(params, json_table)

This method checks whether particular transformation rules are specified in ‘params’ for a JSON output, and converts each row of the input table (‘json_table’) according to these rules. Two specific rules can be applied:

1. array(“<separator>”,<field>): it converts the string value associated to the field name ‘<field>’ into an array by splitting the various textual parts by means of ‘<separator>’. For instance, consider the following JSON structure:

[
    { "names": "Doe, John; Doe, Jane" },
    { "names": "Doe, John; Smith, John" }
]

Executing the rule ‘array("; ",names)’ returns the following new JSON structure:

[
    { "names": [ "Doe, John", "Doe, Jane" ] },
    { "names": [ "Doe, John", "Smith, John" ] }
]

2. dict("<separator>",<field>,<new_field_1>,<new_field_2>,…): it converts the string value associated to the field name ‘<field>’ into a dictionary by splitting the various textual parts by means of ‘<separator>’ and by associating the new fields ‘<new_field_1>’, ‘<new_field_2>’, etc., to these new parts. For instance, consider the following JSON structure:

[
    { "name": "Doe, John" },
    { "name": "Smith, John" }
]

Executing the rule ‘dict(", ",name,family_name,given_name)’ returns the following new JSON structure:

[
    { "name": { "family_name": "Doe", "given_name": "John" } },
    { "name": { "family_name": "Smith", "given_name": "John" } }
]

Each of the specified rules is applied in order, and it works on the JSON structure returned after the execution of the previous rule.
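The effect of the two rules can be reproduced with the sketch below. It is a simplified illustration that handles only top-level fields of flat rows and does not parse the rule syntax; the function names `apply_array` and `apply_dict` are hypothetical.

```python
def apply_array(table, separator, field):
    """Split the string stored in 'field' into a list, for every row."""
    for row in table:
        row[field] = row[field].split(separator)
    return table


def apply_dict(table, separator, field, *new_fields):
    """Split the string stored in 'field' and label each part with a new field."""
    for row in table:
        parts = row[field].split(separator)
        row[field] = dict(zip(new_fields, parts))
    return table


names = [{"names": "Doe, John; Doe, Jane"}, {"names": "Doe, John; Smith, John"}]
print(apply_array(names, "; ", "names"))

people = [{"name": "Doe, John"}, {"name": "Smith, John"}]
print(apply_dict(people, ", ", "name", "family_name", "given_name"))
```

Both helpers modify the rows in place, so applying a second rule works on the output of the first, which matches the sequential behaviour described above.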

preprocess(par_dict, op_item, addon)

This method takes a dictionary of parameters with the current typed values associated with them and the item of the API specification defining the behaviour of that operation, and preprocesses the parameters according to the functions specified in the ‘#preprocess’ field (e.g. “#preprocess lower(doi)”). Each function receives the specified parameters as input (e.g. converting the DOI to lowercase, as in “/api/v1/citations/10.1108/jd-12-2013-0166”).

It is possible to run multiple functions sequentially by concatenating them with “-->” in the API specification document. In this case the output of the function f_i becomes the input of the function f_i+1.

Finally, it is worth mentioning that all the functions specified in the “#preprocess” field must return a tuple of values defining how the particular value passed in the dictionary must be changed.
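The chaining behaviour can be sketched as follows. This is an illustration only: the parsing is deliberately naive, the separator is assumed to be written `-->`, the functions live in a plain dictionary rather than in an addon file, and `strip_ws` and `run_preprocess` are hypothetical names. The input DOI is written in uppercase purely to make the effect visible.

```python
def lower(s):
    # Each function returns a tuple, one value per parameter it received.
    return (s.lower(),)


def strip_ws(s):
    # Hypothetical second function, used only to show chaining.
    return (s.strip(),)


FUNCTIONS = {"lower": lower, "strip_ws": strip_ws}


def run_preprocess(par_dict, spec):
    """Apply each 'name(param, ...)' step of 'spec' to a copy of 'par_dict'."""
    result = dict(par_dict)
    for step in spec.split("-->"):
        name, _, args = step.strip().partition("(")
        params = [p.strip() for p in args.rstrip(")").split(",")]
        new_values = FUNCTIONS[name](*(result[p] for p in params))
        # The returned tuple replaces the values of the same parameters.
        result.update(zip(params, new_values))
    return result


out = run_preprocess({"doi": " 10.1108/JD-12-2013-0166 "}, "strip_ws(doi) --> lower(doi)")
print(out)  # {'doi': '10.1108/jd-12-2013-0166'}
```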

postprocess(res, op_item, addon)

This method takes the result table returned by running the SPARQL query in an API operation (specified as input) and changes some of its results according to the functions specified in the ‘#postprocess’ field (e.g. “#postprocess remove_date("2018")”). These functions can take parameters as input, but their first parameter, which is not written in the specification, is always the result table. Note that this result table (i.e. a list of tuples) contains, in each cell, a tuple holding the plain value as well as the typed value, which enables better comparisons and operations if needed. An example of such a table is shown below:

[
    ("id", "date"),
    (("my_id_1", "my_id_1"), (datetime(2018, 3, 2), "2018-03-02")),
    …
]

Note that the typed value and the plain value of each cell can be selected by using the methods “tv” and “pv” respectively. In addition, it is possible to run multiple functions sequentially by concatenating them with “-->” in the API specification document. In this case the output of the function f_i becomes the input result table of the function f_i+1.
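A postprocessing function working on such a table might look like the sketch below. The function `drop_rows_in_year` is hypothetical, cells are assumed to be (typed value, plain value) pairs with the header as first row, and the exact return convention RAMOSE expects from addon functions is not covered here: this sketch simply returns the new table.

```python
from datetime import datetime


def drop_rows_in_year(res, year):
    """Hypothetical postprocess step: remove rows whose date falls in 'year'."""
    header, rows = res[0], res[1:]
    date_idx = header.index("date")
    # Compare on the typed value (first element of the cell), not on the string.
    kept = [row for row in rows if row[date_idx][0].year != int(year)]
    return [header] + kept


table = [
    ("id", "date"),
    (("my_id_1", "my_id_1"), (datetime(2018, 3, 2), "2018-03-02")),
    (("my_id_2", "my_id_2"), (datetime(2016, 5, 9), "2016-05-09")),
]
result = drop_rows_in_year(table, "2018")
print([row[0][1] for row in result[1:]])  # ['my_id_2']
```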

handling_params(params, table)

This method is used for filtering the results that are returned after the post-processing phase. In particular, it is possible to:

[require=<field_name>] exclude all the rows that have an empty value in the field specified - e.g. “require=doi” removes all the rows that do not have any string specified in the “doi” field;
[filter=<field_name>:<operator><value>] consider only the rows where the string in the input field is compliant with the value specified. If no operator is specified, the value is interpreted as a regular expression; otherwise it is compared according to the particular type associated with that field. Possible operators are “=”, “<”, and “>” - e.g. “filter=title:semantics?” returns all the rows that contain the string “semantic” or “semantics” in the field title, while “filter=date:>2016-05” returns all the rows that have a date greater than May 2016;
[sort=<order>(<field_name>)] sort all the results according to the value and type of the particular field specified in input. It is possible to sort the rows either in ascending (“asc”) or descending (“desc”) order - e.g. “sort=desc(date)” sorts all the rows according to the value specified in the field “date” in descending order.

Note that these filtering operations are applied in the order presented above - first the “require”, then the “filter”, and finally the “sort”. It is possible to specify one or more filtering operations of the same kind (e.g. “require=doi&require=title”).
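The three operations and their order of application can be sketched as below. This is a simplified illustration: rows are plain dictionaries, all values are compared as strings (which happens to work for ISO-formatted dates) rather than by their configured type, and the function names and sample data are hypothetical.

```python
import re


def require(rows, field):
    """Keep only the rows with a non-empty value in 'field'."""
    return [r for r in rows if r.get(field)]


def filter_rows(rows, field, expr):
    """Apply '<operator><value>' or, with no operator, a regular expression."""
    if expr[:1] in ("=", "<", ">"):
        op, value = expr[0], expr[1:]
        tests = {
            "=": lambda v: v == value,
            "<": lambda v: v < value,
            ">": lambda v: v > value,
        }
        return [r for r in rows if tests[op](r[field])]
    pattern = re.compile(expr)
    return [r for r in rows if pattern.search(r[field])]


def sort_rows(rows, order, field):
    """Sort by 'field', ascending ('asc') or descending ('desc')."""
    return sorted(rows, key=lambda r: r[field], reverse=(order == "desc"))


rows = [
    {"doi": "10.1/a", "title": "semantic publishing", "date": "2017-01-10"},
    {"doi": "", "title": "semantics of data", "date": "2018-03-02"},
    {"doi": "10.1/c", "title": "open citations", "date": "2016-06-01"},
    {"doi": "10.1/d", "title": "the semantics of citations", "date": "2015-02-11"},
]

# Equivalent of "require=doi&filter=date:>2016-05&sort=desc(date)".
selected = sort_rows(filter_rows(require(rows, "doi"), "date", ">2016-05"), "desc", "date")
print([r["doi"] for r in selected])  # ['10.1/a', '10.1/c']
```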

type_fields(res, op_item): It creates a version of the results ‘res’ that adds, to each value of the fields, the same value interpreted with the type specified in the specification file (field ‘field_type’). Note that ‘str’ is used as default in case no further specifications are provided.

remove_types(res): This method takes the results ‘res’ that also include the typed values and returns a version of those results without the types, ready to be stored on the file system.
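The round trip between ‘type_fields’ and ‘remove_types’ can be pictured with the sketch below. The type names, the converter table, the (typed, plain) cell order, and the function names `add_types` and `strip_types` are all assumptions made for illustration; the actual syntax of ‘field_type’ in the specification file is not shown here.

```python
from datetime import datetime

# Assumption: these type names and converters are illustrative only.
CONVERTERS = {
    "str": str,
    "int": int,
    "datetime": lambda s: datetime.strptime(s, "%Y-%m-%d"),
}


def add_types(table, field_types):
    """Pair every plain value with its typed counterpart: (typed, plain)."""
    header, body = table[0], table[1:]
    typed = [header]
    for row in body:
        typed.append(tuple(
            # Fields with no declared type fall back to 'str'.
            (CONVERTERS[field_types.get(name, "str")](plain), plain)
            for name, plain in zip(header, row)
        ))
    return typed


def strip_types(typed_table):
    """Drop the typed half of every cell, keeping only the plain values."""
    header, body = typed_table[0], typed_table[1:]
    return [header] + [tuple(plain for _, plain in row) for row in body]


table = [("id", "date"), ("my_id_1", "2018-03-02")]
typed = add_types(table, {"date": "datetime"})
print(typed[1])
print(strip_types(typed) == table)  # True
```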

exec(method='get', content_type='application/json')

This method takes as input the HTTP method to use for the call and the content type to return, and executes the operation as indicated in the specification file, by running (in the following order):

1. the methods to preprocess the query;
2. the SPARQL query related to the operation called, by using the parameters indicated in the URL;
3. the specification of all the types of the various rows returned;
4. the methods to postprocess the result;
5. the application of the filters to remove, filter, and sort the result;
6. the removal of the types added at step 3, so as to have a data structure ready to be returned;
7. the conversion into the format requested by the user.