Advanced Query Logic
Interact with any one of our journey-based queries for a while, and you'll find that you can start ask some pretty complicated questions. We have thoughtfully built the CareQuery API to meet the SQL-like needs of complex data requests within the context of an API SDK.
To accomplish this capability, we needed to make the logic straightforward, consistent and natural to the expertise, hypothesis and questions of our users.
Parameter Category Options
CareQuery users should able to ask natural language questions in flow, and not spend time punching in codes. In order to achieve this, we have curated a handful of categorically related parameters that all map to the same set of underlying codes.
category | parameters | underlying codes |
---|---|---|
diagnosis | diag_code short_diag_code diag_subcategory diag_category | ICD-10 Code |
procedure | proc_code proc_subcategory proc_category | CPT/HCPCS Code |
specialty | taxonomy subspecialty specialty specialty_category | Taxonomy Code |
geography | region division state metro short_zip_code | State Code or 3-Digit Zip |
As a result, CareQuery users can simplify queries, iterate faster, and focus on solving problems, not retrieving data. CareQuery makes it easy to explore these parameter values and quickly wield them within a query:
# 1) import package
from care_query.care_query import CareQuery
# 2) instantiate and connect
cq = CareQuery(email = "your-email",
token = "your-api-token",
sftp_key = "path/to/your/company.PEM")
# list parameter options
cq.glossary.list_options("diag_subcategory", contains = "pedal cyclist injured in collision")
# 3) build query
bike_ouch = cq.careVisit(diag_subcategory = ['pedal cyclist injured in collision with car, pick-up truck or van',
'pedal cyclist injured in collision with fixed or stationary object',
'pedal cyclist injured in collision with heavy transport vehicle or bus',
'pedal cyclist injured in collision with other nonmotor vehicle',
'pedal cyclist injured in collision with other pedal cycle',
'pedal cyclist injured in collision with pedestrian or animal',
'pedal cyclist injured in collision with railway train or railway vehicle',
'pedal cyclist injured in collision with two- or three-wheeled motor vehicle'],
region = ["west"],
specialty = 'emergency medicine',
min_date = "2023-03-01",
max_date = "2023-06-01")
# 4) submit query, data will return to SFTP endpoint
# could also run .sample() to return the first 1k rows or .execute() to get a size estimate
result_object = bike_ouch.execute()
# 5) check query status
# user will be sent an email upon submission and completion, check query status along the way
bike_ouch.queryStatus()
# 6) upon successful completion of query, read data directly into pandas environment
bike_ouch_data = bike_ouch.sftpToPandas()
The AND/OR Conundrum
Beyond the ease of use and removing the need to code punch, the largest challenge to CareQuery overcomes within the API context is the ability to specify AND and OR logic across (and within) the various parameters specified.
Doing so via the following parameter options and logical rules...
Default to OR logic WITHIN parameters
Any parameters for journey-based queries default to OR logic when values are placed within a list, therefore:
results = cq.careEncounter(proc_code = ["99453","99454","99457","99458"])
...directly translates to...
select *
from careEncounters
where proc_code in ("99453","99454","99457","99458")
Default to OR logic WITHIN parameter categories
Because all parameter categories map to the same underlying codes, this OR logic persists across any hierarchical parameters, therefore:
results = cq.careEncounter(proc_code = ["99453","99454","99457","99458"],
proc_subcategory = ["chemotherapy drugs","radiology - radiation oncology"])
...directly translates to...
select careEncounters
from JOURNEY_TABLE
where proc_code in ("99453","99454","99457","99458")
or proc_subcategory in ("chemotherapy drugs","radiology - radiation oncology")
Default to AND logic ACROSS categorically unrelated parameters
Specification of categorically unrelated parameters are assumed to coincide with one anther, insofar as when you ask for a start date AND an end date AND a location, we can assume that you want data on a specific place within a time range, therefore:
# nebraskans RPM encounters in 2022
results = cq.careEncounter(proc_code = ["99453","99454","99457","99458"],
min_date = "2022-01-01",
max_date = "2023-01-01",
state = "NE")
...directly translates to...
select careEpisodes
from JOURNEY_TABLE
where proc_code in ("99453","99454","99457","99458")
and min_date = "2022-01-01",
and max_date = "2023-01-01",
and state = "NE"
Specify AND logic WITHIN categorically related parameters (with nested lists)
However, someone may want to execute a more complex query, one that entails AND statements WITHIN a specified parameters. One may accomplish this by creating a nested list as the parameter value, insofar as:
# nebraskan episodes of care with RPM and cardiovascular in 2022
results = cq.careEpisode(proc_code = [["99453","99454","99457","99458"],["92953", "93271", "93788", "93568"]],
min_date = "2022-01-01",
max_date = "2023-01-01",
state = "NE")
...directly translates to...
select careEpisodes
from JOURNEY_TABLE
where proc_code in ("99453","99454","99457","99458")
and proc_code in ("92953", "93271", "93788", "93568")
and min_date = "2022-01-01",
and max_date = "2023-01-01",
and state = "NE"
Specify AND logic WITHIN categorically related parameters (with nested list auto greedy flag)
The creation of a nested list automatically flips the internal 'greedy' flag to False, and therefore applies AND logic to all other categorically related parameters, insofar as:
# nebraskan episodes of care with RPM, cardiovascular and surgery in 2022
results = cq.careEpisode(proc_code = [["99453","99454","99457","99458"],["92953", "93271", "93788", "93568"]],
proc_category = "surgery",
min_date = "2022-01-01",
max_date = "2023-01-01",
state = "NE")
...directly translates to...
select careEpisode
from JOURNEY_TABLE
where proc_code in ("99453","99454","99457","99458")
and proc_code in ("92953", "93271", "93788", "93568")
and proc_category = "surgery"
and min_date = "2022-01-01",
and max_date = "2023-01-01",
and state = "NE"
Specify AND logic WITHIN categorically related parameters (with greedy flag)
Lastly, users have the ability to specify the AND logic WITHIN categorically related parameter solely with including a 'greedy' parameter within the query.
There are three 'greedy' flags within CareQuery journey-based queries: proc_greedy
, diag_greedy
, and specialty_greedy
. Here is an example:
# nebraskan episodes of care with RPM, cardiovascular and surgery in 2022
results = cq.careEpisode(proc_code = ["99453","99454","99457","99458"],
proc_category = "surgery",
proc_greedy = False, # turns all proc selections to AND
min_date = "2022-01-01",
max_date = "2023-01-01",
state = "NE")
...directly translates to...
select careEpisode
from JOURNEY_TABLE
where proc_code in ("99453","99454","99457","99458")
and proc_category = "surgery"
and min_date = "2022-01-01",
and max_date = "2023-01-01",
and state = "NE"
Logic CareQuery doesn't have yet
CareQuery currently does not allow for the NOT functionality within query parameters, leaving the ability to say not surgery or this but not that in the hands of our users post-query.
Logic Help & Best Practices
We do our best to keep you from executing query-breaking logic or receiving unintended query results. We recommend the frequent use of:
.estimate()
to understand the data size of the query (estimate by default).sample()
to get a peak at the data results.execute(show = True)
to view the logic of your query printed out in plain text (show by default)
Printing Query Logic
By default, when you create a query, the query logic will print in plain text:
# creating a query automatically prints query
query = cq.careEncounter(proc_code = ["99453","99454","99457","99458","99091"],
state = ["CO"],
min_date = "2022-01-01",
max_date = "2023-06-01",
limit = False)
Error messages that can guide the way
Error messages articulate exactly what went wrong with a query, whether it is a data type, parameter value or impossible dimensionality, you'll receive a nudge in the right direction.
For example, asking for encounters with two referring NPIs at once is not possible, thus it gently errors:
# encounters with two referring NPIs (not possible)
query = cq.careEncounter(proc_subcategory = ['medicine - adaptive behavior services',
'medicine - health and behavior assessment/intervention',
'eval. and mgmt. - psychiatric collaborative care management services',
'medicine - psychiatry'],
metro = ['duluth, mn-wi'],
ref_npi = [["1376592931"],["1689189300"]],
min_date = "2022-01-01",
max_date = "2023-06-01",
limit = False)
Updated 11 months ago