Advanced Query Logic

Interact with any one of our journey-based queries for a while, and you'll find that you can start ask some pretty complicated questions. We have thoughtfully built the CareQuery API to meet the SQL-like needs of complex data requests within the context of an API SDK.

To accomplish this capability, we needed to make the logic straightforward, consistent and natural to the expertise, hypothesis and questions of our users.

Parameter Category Options

CareQuery users should able to ask natural language questions in flow, and not spend time punching in codes. In order to achieve this, we have curated a handful of categorically related parameters that all map to the same set of underlying codes.

category	parameters	underlying codes
diagnosis	diag_code short_diag_code diag_subcategory diag_category	ICD-10 Code
procedure	proc_code proc_subcategory proc_category	CPT/HCPCS Code
specialty	taxonomy subspecialty specialty specialty_category	Taxonomy Code
geography	region division state metro short_zip_code	State Code or 3-Digit Zip

As a result, CareQuery users can simplify queries, iterate faster, and focus on solving problems, not retrieving data. CareQuery makes it easy to explore these parameter values and quickly wield them within a query:

# 1) import package
from care_query.care_query import CareQuery

# 2) instantiate and connect
cq = CareQuery(email = "your-email", 
               token = "your-api-token",
               sftp_key = "path/to/your/company.PEM")

# list parameter options
cq.glossary.list_options("diag_subcategory", contains = "pedal cyclist injured in collision")

# 3) build query
bike_ouch = cq.careVisit(diag_subcategory = ['pedal cyclist injured in collision with car, pick-up truck or van',
                                             'pedal cyclist injured in collision with fixed or stationary object',
                                             'pedal cyclist injured in collision with heavy transport vehicle or bus',
                                             'pedal cyclist injured in collision with other nonmotor vehicle',
                                             'pedal cyclist injured in collision with other pedal cycle',
                                             'pedal cyclist injured in collision with pedestrian or animal',
                                             'pedal cyclist injured in collision with railway train or railway vehicle',
                                             'pedal cyclist injured in collision with two- or three-wheeled motor vehicle'],
                         region = ["west"],
                         specialty = 'emergency medicine',
                         min_date = "2023-03-01",
                         max_date = "2023-06-01")

# 4) submit query, data will return to SFTP endpoint
		# could also run .sample() to return the first 1k rows  or .execute() to get a size estimate
result_object = bike_ouch.execute()

# 5) check query status
	# user will be sent an email upon submission and completion, check query status along the way
bike_ouch.queryStatus()

# 6) upon successful completion of query, read data directly into pandas environment
bike_ouch_data = bike_ouch.sftpToPandas()

The AND/OR Conundrum

Beyond the ease of use and removing the need to code punch, the largest challenge to CareQuery overcomes within the API context is the ability to specify AND and OR logic across (and within) the various parameters specified.

Doing so via the following parameter options and logical rules...

Default to OR logic WITHIN parameters

Any parameters for journey-based queries default to OR logic when values are placed within a list, therefore:

results = cq.careEncounter(proc_code = ["99453","99454","99457","99458"])

...directly translates to...

select	* 
from		careEncounters
where		proc_code in ("99453","99454","99457","99458")

Default to OR logic WITHIN parameter categories

Because all parameter categories map to the same underlying codes, this OR logic persists across any hierarchical parameters, therefore:

results = cq.careEncounter(proc_code = ["99453","99454","99457","99458"],
                           proc_subcategory = ["chemotherapy drugs","radiology - radiation oncology"])

...directly translates to...

select	careEncounters 
from		JOURNEY_TABLE
where		proc_code in ("99453","99454","99457","99458")
or			proc_subcategory in ("chemotherapy drugs","radiology - radiation oncology")

Default to AND logic ACROSS categorically unrelated parameters

Specification of categorically unrelated parameters are assumed to coincide with one anther, insofar as when you ask for a start date AND an end date AND a location, we can assume that you want data on a specific place within a time range, therefore:

# nebraskans RPM encounters in 2022
results = cq.careEncounter(proc_code = ["99453","99454","99457","99458"],
                           min_date = "2022-01-01",
                           max_date = "2023-01-01",
                           state = "NE")

...directly translates to...

select	careEpisodes
from		JOURNEY_TABLE
where		proc_code in ("99453","99454","99457","99458")
and			min_date = "2022-01-01",
and     max_date = "2023-01-01",
and     state = "NE"

Specify AND logic WITHIN categorically related parameters (with nested lists)

However, someone may want to execute a more complex query, one that entails AND statements WITHIN a specified parameters. One may accomplish this by creating a nested list as the parameter value, insofar as:

# nebraskan episodes of care with RPM and cardiovascular in 2022
results = cq.careEpisode(proc_code = [["99453","99454","99457","99458"],["92953", "93271", "93788", "93568"]],
                         min_date = "2022-01-01",
                         max_date = "2023-01-01",
                         state = "NE")

...directly translates to...

select	careEpisodes 
from		JOURNEY_TABLE
where		proc_code in ("99453","99454","99457","99458")
and			proc_code in ("92953", "93271", "93788", "93568")
and			min_date = "2022-01-01",
and     max_date = "2023-01-01",
and     state = "NE"

Specify AND logic WITHIN categorically related parameters (with nested list auto greedy flag)

The creation of a nested list automatically flips the internal 'greedy' flag to False, and therefore applies AND logic to all other categorically related parameters, insofar as:

# nebraskan episodes of care with RPM, cardiovascular and surgery in 2022
results = cq.careEpisode(proc_code = [["99453","99454","99457","99458"],["92953", "93271", "93788", "93568"]],
                         proc_category = "surgery",
                         min_date = "2022-01-01",
                         max_date = "2023-01-01",
                         state = "NE")

...directly translates to...

select	careEpisode
from		JOURNEY_TABLE
where		proc_code in ("99453","99454","99457","99458")
and			proc_code in ("92953", "93271", "93788", "93568")
and			proc_category = "surgery"
and			min_date = "2022-01-01",
and     max_date = "2023-01-01",
and     state = "NE"

Specify AND logic WITHIN categorically related parameters (with greedy flag)

Lastly, users have the ability to specify the AND logic WITHIN categorically related parameter solely with including a 'greedy' parameter within the query.

There are three 'greedy' flags within CareQuery journey-based queries: proc_greedy, diag_greedy, and specialty_greedy. Here is an example:

# nebraskan episodes of care with RPM, cardiovascular and surgery in 2022
results = cq.careEpisode(proc_code = ["99453","99454","99457","99458"],
                         proc_category = "surgery",
                         proc_greedy = False, # turns all proc selections to AND
                         min_date = "2022-01-01",
                         max_date = "2023-01-01",
                         state = "NE")

...directly translates to...

select	careEpisode
from		JOURNEY_TABLE
where		proc_code in ("99453","99454","99457","99458")
and			proc_category = "surgery"
and			min_date = "2022-01-01",
and     max_date = "2023-01-01",
and     state = "NE"

Logic CareQuery doesn't have yet

CareQuery currently does not allow for the NOT functionality within query parameters, leaving the ability to say not surgery or this but not that in the hands of our users post-query.

Logic Help & Best Practices

We do our best to keep you from executing query-breaking logic or receiving unintended query results. We recommend the frequent use of:

.estimate() to understand the data size of the query (estimate by default)
.sample() to get a peak at the data results
.execute(show = True) to view the logic of your query printed out in plain text (show by default)

Printing Query Logic

By default, when you create a query, the query logic will print in plain text:

# creating a query automatically prints query
query = cq.careEncounter(proc_code = ["99453","99454","99457","99458","99091"],
                         state = ["CO"],
                         min_date = "2022-01-01",
                         max_date = "2023-06-01",
                         limit = False)

Error messages that can guide the way

Error messages articulate exactly what went wrong with a query, whether it is a data type, parameter value or impossible dimensionality, you'll receive a nudge in the right direction.

For example, asking for encounters with two referring NPIs at once is not possible, thus it gently errors:

# encounters with two referring NPIs (not possible)
query = cq.careEncounter(proc_subcategory = ['medicine - adaptive behavior services', 
                                             'medicine - health and behavior assessment/intervention',
                                             'eval. and mgmt. - psychiatric collaborative care management services', 
                                             'medicine - psychiatry'],
                         metro = ['duluth, mn-wi'],
                         ref_npi = [["1376592931"],["1689189300"]],
                         min_date = "2022-01-01",
                         max_date = "2023-06-01",
                         limit = False)