Utils
RawResourceException
Bases: ValidationError
Raised when trying to ingest a resource that has been marked with "disable_parsing".
Source code in ckanext/versioned_datastore/lib/utils.py
ReadOnlyResourceException
Bases: ValidationError
Raised when a write operation of some variety is attempted on a resource which has been marked as read only.
Source code in ckanext/versioned_datastore/lib/utils.py
es_client()
Retrieves an Elasticsearch client for the cluster currently in use. If Splitgill is not yet configured on the VDS plugin, an exception is raised.
Returns:
| Type | Description |
|---|---|
| Elasticsearch | an Elasticsearch object |
Source code in ckanext/versioned_datastore/lib/utils.py
get_available_datastore_resources(ignore_auth=False, user_id='')
Simple wrapper around get_available_resources which provides a list of available datastore resources to the currently logged-in user.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| ignore_auth | bool | whether to ignore authentication (default: False) | False |
Returns:
| Type | Description |
|---|---|
| Set[str] | a set of resource IDs |
Source code in ckanext/versioned_datastore/lib/utils.py
get_available_resources(datastore_only, ignore_auth=False, user_id='')
Gets a set of resource IDs that are available to the currently logged-in user and, if datastore_only is set to True, are datastore active. If no user is logged in, all public datastore resource IDs are returned. The resource IDs are returned as a set so that a list of requested IDs can be checked quickly against the available IDs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| datastore_only | bool | whether to only return resource IDs that are datastore active | required |
| ignore_auth | bool | whether to ignore authentication (default: False) | False |
Returns:
| Type | Description |
|---|---|
| Set[str] | a set of resource IDs |
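Returning a set makes the check against a list of requested IDs a plain membership test. The following is a minimal sketch of that check; the hard-coded set stands in for the function's return value, and the IDs and the `filter_requested` helper are hypothetical.

```python
from typing import List, Set

# Stand-in for the set returned by get_available_resources(); the IDs are made up.
available: Set[str] = {"res-a", "res-b", "res-c"}


def filter_requested(requested: List[str], available_ids: Set[str]) -> List[str]:
    """Keep only the requested IDs that are available, preserving request order."""
    return [resource_id for resource_id in requested if resource_id in available_ids]


allowed = filter_requested(["res-c", "res-x", "res-a"], available)
```

Each lookup against the set is constant time on average, so the cost grows with the number of requested IDs rather than with the number of available ones.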
Source code in ckanext/versioned_datastore/lib/utils.py
get_database(resource_id)
Retrieves a SplitgillDatabase object for the given resource ID. If the SplitgillClient on the VDS plugin isn't yet configured, an exception is raised.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_id | str | the resource's ID | required |
Returns:
| Type | Description |
|---|---|
| SplitgillDatabase | a SplitgillDatabase |
Source code in ckanext/versioned_datastore/lib/utils.py
get_latest_resource_fields(resource_ids)
Retrieves a list of fields available on the latest indices for the given resources.
Does not do any authentication. Only call from within other authenticated functions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_ids | List[str] | list of resource IDs | required |
Returns:
| Type | Description |
|---|---|
| Dict[str, Dict[str, Dict]] | dict of field names, the resources they're found in, and their details within those resources |
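To make the nested return type easier to picture, here is a small sketch of how such a dict might be consumed. The nesting order (field name, then resource ID, then details) is inferred from the description above, and the sample values, the detail keys, and the `fields_in_all` helper are all hypothetical.

```python
from typing import Dict, List

# Hypothetical return value: field name -> resource ID -> field details.
# The contents of the innermost dicts are invented for illustration.
fields: Dict[str, Dict[str, Dict]] = {
    "genus": {"res-a": {"type": "text"}, "res-b": {"type": "text"}},
    "year": {"res-a": {"type": "number"}},
}


def fields_in_all(field_map: Dict[str, Dict[str, Dict]], resource_ids: List[str]) -> List[str]:
    """Return the names of fields that appear in every one of the given resources."""
    return sorted(
        name
        for name, by_resource in field_map.items()
        if all(resource_id in by_resource for resource_id in resource_ids)
    )


common = fields_in_all(fields, ["res-a", "res-b"])
```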
Source code in ckanext/versioned_datastore/lib/utils.py
get_latest_version(resource_id)
Retrieves the latest version of the given resource from the status index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_id | | the resource's id | required |
Returns:
| Type | Description |
|---|---|
| Optional[int] | the version or None if the resource isn't indexed |
Source code in ckanext/versioned_datastore/lib/utils.py
get_public_resources()
Retrieves a list of public resources and whether they are present in the datastore.
Returns:
| Type | Description |
|---|---|
| Dict[str, bool] | dict of publicly available resource IDs and whether they are available in the datastore |
Source code in ckanext/versioned_datastore/lib/utils.py
get_sg_name(resource_id)
Adds the configured prefix to the start of the resource ID to get the name used for the resource's data in Elasticsearch.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_id | str | the resource id | required |
Returns:
| Type | Description |
|---|---|
| str | the resource's Splitgill database name |
Source code in ckanext/versioned_datastore/lib/utils.py
is_datastore_only_resource(resource_url)
Checks whether the resource URL is a datastore-only resource URL. When data is uploaded directly to the API without a source file or URL, the resource's URL is set to "_datastore_only_resource" to indicate this. This function checks whether the provided resource URL is one of these URLs. Several variants are checked because CKAN sometimes prepends a protocol to these URLs when saving the resource.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_url | str | the URL of the resource |  required |
Returns:
| Type | Description |
|---|---|
| bool | True if the resource is a datastore only resource, False if not |
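The check described above can be sketched as a comparison against the marker with and without a protocol in front. This is an illustration only: the exact variants the real function accepts are not listed here, so the `http://` and `https://` forms and the `looks_datastore_only` name are assumptions.

```python
DATASTORE_ONLY = "_datastore_only_resource"

# Assumed variants: the bare marker plus the marker with a protocol prepended.
ACCEPTED_URLS = {
    DATASTORE_ONLY,
    f"http://{DATASTORE_ONLY}",
    f"https://{DATASTORE_ONLY}",
}


def looks_datastore_only(resource_url: str) -> bool:
    """Return True if the URL is the datastore-only marker in any assumed form."""
    return resource_url in ACCEPTED_URLS
```

Comparing against a fixed set of whole URLs avoids false positives from URLs that merely contain the marker somewhere in their path.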
Source code in ckanext/versioned_datastore/lib/utils.py
is_datastore_resource(resource_id)
Checks if any data has made it to Elasticsearch for this resource ID. Note that this only checks Elasticsearch, not MongoDB, so it is intended simply to test whether there is any searchable data for the resource.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource_id | str | the resource id | required |
Returns:
| Type | Description |
|---|---|
| bool | True if the resource is a datastore resource, False if not |
Source code in ckanext/versioned_datastore/lib/utils.py
is_ingestible(resource)
Returns True if the resource can be ingested into the datastore and False if not. To be ingestible, the resource must either be a datastore only resource (signified by the url being set to _datastore_only_resource) or have a format that we can ingest (the format field on the resource is used for this, not the URL). If the url is None, False is returned. This is technically not possible due to a Resource model constraint, but it's worth covering off anyway.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| resource | dict | the resource dict | required |
Returns:
| Type | Description |
|---|---|
| bool | True if it is, False if not |
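The decision order described above (missing URL, then datastore-only URL, then format) can be sketched as follows. The set of ingestible formats, the case-insensitive comparison, and the `can_ingest` name are assumptions made for illustration. The sketch also uses a plain equality test for the datastore-only URL, whereas the documented is_datastore_only_resource covers protocol-prefixed variants too.

```python
# Hypothetical set of formats; the formats the plugin really accepts are not listed here.
INGESTIBLE_FORMATS = {"csv", "tsv", "xlsx"}


def can_ingest(resource: dict) -> bool:
    """Mirror the documented rules: URL must exist, then marker URL or known format."""
    url = resource.get("url")
    if url is None:
        # Should not happen given the Resource model constraint, but handled anyway.
        return False
    if url == "_datastore_only_resource":
        return True
    # The format field decides, not the URL's file extension.
    resource_format = resource.get("format") or ""
    return resource_format.lower() in INGESTIBLE_FORMATS
```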
Source code in ckanext/versioned_datastore/lib/utils.py
is_resource_read_only(resource_id)
Loops through the plugin implementations checking if any of them want the given resource id to be read only.
Returns:
| Type | Description |
|---|---|
| bool | True if the resource should be treated as read only, False if not |
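The "any plugin can veto writes" behaviour can be sketched without CKAN as a loop over objects exposing a check method. The plugin classes, the `is_read_only` method name, and the `any_plugin_locks` helper are hypothetical stand-ins, not the extension's actual interface.

```python
from typing import Iterable, Set


class LockedSet:
    """Hypothetical plugin that marks a fixed set of resource IDs as read only."""

    def __init__(self, locked: Set[str]):
        self.locked = locked

    def is_read_only(self, resource_id: str) -> bool:
        return resource_id in self.locked


class NeverLocks:
    """Hypothetical plugin that never requests read-only treatment."""

    def is_read_only(self, resource_id: str) -> bool:
        return False


def any_plugin_locks(resource_id: str, plugins: Iterable) -> bool:
    """Return True as soon as one plugin wants the resource to be read only."""
    return any(plugin.is_read_only(resource_id) for plugin in plugins)


plugins = [NeverLocks(), LockedSet({"res-a"})]
```

With no plugins registered, `any` returns False, so resources are writable by default in this sketch.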
Source code in ckanext/versioned_datastore/lib/utils.py
mongo_client()
Retrieves a Mongo client for the database instance currently in use. If Splitgill is not yet configured on the VDS plugin, an exception is raised.
Returns:
| Type | Description |
|---|---|
| MongoClient | a MongoClient object |
Source code in ckanext/versioned_datastore/lib/utils.py
sg_client()
Retrieves a Splitgill client object. If Splitgill is not configured yet on the VDS plugin, an exception is raised.
Returns:
| Type | Description |
|---|---|
| SplitgillClient | a SplitgillClient object |
Source code in ckanext/versioned_datastore/lib/utils.py
unprefix_index_name(sg_index_name)
Extracts the resource ID from the given Splitgill index name by removing the Splitgill specific parts, plus removing the prefix (if one is configured). If the resource ID cannot be extracted, a ValueError is raised.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sg_index_name | str | the Splitgill index name | required |
Returns:
| Type | Description |
|---|---|
| str | the resource's ID |
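The two-step extraction (strip the index-specific parts, then strip the prefix, raising ValueError on failure) might look like the sketch below. The index layout `data-<name>-latest` / `data-<name>-arc-<n>`, the `vds-` prefix, and the function name are assumptions chosen for illustration; consult Splitgill itself for the real naming scheme.

```python
import re

PREFIX = "vds-"  # hypothetical; the real prefix comes from the plugin configuration

# Assumed layout for illustration only: data-<database name>-latest or data-<database name>-arc-<n>
INDEX_PATTERN = re.compile(r"^data-(?P<name>.+)-(?:latest|arc-\d+)$")


def resource_id_from_index(index_name: str, prefix: str = PREFIX) -> str:
    """Strip the assumed index wrapper and the prefix, or raise ValueError."""
    match = INDEX_PATTERN.match(index_name)
    if match is None:
        raise ValueError(f"unrecognised index name: {index_name}")
    name = match.group("name")
    if prefix:
        if not name.startswith(prefix):
            raise ValueError(f"index name lacks the expected prefix: {index_name}")
        name = name[len(prefix):]
    return name
```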
Source code in ckanext/versioned_datastore/lib/utils.py
unprefix_sg_name(sg_name)
Removes the configured prefix from the start of the index name to get the resource id.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sg_name | str | the Splitgill database name | required |
Returns:
| Type | Description |
|---|---|
| str | the resource's id |
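Together with get_sg_name, this function forms a round trip between resource IDs and Splitgill database names. A minimal sketch of that round trip follows; the `vds-` prefix and both helper names are hypothetical, and leaving an unprefixed name unchanged is a choice made in this sketch rather than documented behaviour.

```python
PREFIX = "vds-"  # hypothetical; the real prefix comes from the plugin configuration


def add_prefix(resource_id: str, prefix: str = PREFIX) -> str:
    """Build a database name by putting the prefix in front of the resource ID."""
    return f"{prefix}{resource_id}"


def remove_prefix(sg_name: str, prefix: str = PREFIX) -> str:
    """Strip the prefix only when it is actually present at the start of the name."""
    if prefix and sg_name.startswith(prefix):
        return sg_name[len(prefix):]
    return sg_name


name = add_prefix("abc123")
```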
Source code in ckanext/versioned_datastore/lib/utils.py