Advanced Usage

Cortex cache

Since passive data can be large (GPS may be sampled at 1 Hz and the accelerometer at 5 Hz), some Cortex functions take time to run. However, you may want to repeat calculations on a participant's raw data, so we provide two ways to save data locally and accelerate computation:

  1. Attachments are used in primary features to save intermediate computations or values. This may be useful if you develop features for Cortex.
  2. The cache is used to save raw data and accelerate analysis. The default is cache=False, which prevents unwanted data collection on your machine. If you set cache=True, you must also set the environment variable "CORTEX_CACHE_DIR" to the path of your cache directory; raw data will then be saved there, as shown in the sketch below.
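
A minimal sketch of enabling the cache (the directory path here is only an example, and cache is assumed to be passed as a keyword argument to cortex.run, per the description above):

import os
import cortex

# Point Cortex at a writable directory for cached raw data.
os.environ['CORTEX_CACHE_DIR'] = '/home/_data/cortex_cache'

# With cache=True, raw data fetched for this call is saved under
# CORTEX_CACHE_DIR and reused by later calls instead of being re-downloaded.
df = cortex.run('U1089294357', ['survey_scores'],
                start=0, end=cortex.now(), cache=True)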

Case Example: Anomaly Detection

It's easy to get started with more advanced analysis on data collected from mindLAMP. In this example, we'll walk through using Cortex with Luminol, an anomaly detection library, and Altair, an interactive visualization library, to tag and visualize survey scores for a particular patient.

info

This sample code is not intended for use in clinical practice. All data and visualizations provided here are examples only and are not linked to actual patients in any way, shape, or form. A Jupyter Notebook version of this example is also available.

import cortex
import luminol
import pandas as pd
import numpy as np
import altair as alt
from luminol.anomaly_detector import AnomalyDetector

Preparing the data using Cortex

First, call cortex.run(...) with the participant ID of interest. Then, rearrange the resulting DataFrame by setting its index to the timestamp and adding an anomaly column for later use.

df = cortex.run(
    'U1089294357', ['survey_scores'],
    start=0, end=cortex.now()
)['survey_scores']

# Use the integer timestamp (divided by 1,000) as the index, so anomalies can be keyed by time.
df.index = df.timestamp.astype(int) // 10**3
df['anomaly'] = 0  # default to no anomaly
[INFO:feature_types:_wrapper2] Processing primary feature "cortex.survey_scores"...
[INFO:feature_types:_wrapper2] Cortex caching directory set to: /home/_data/cortex_cache
[INFO:feature_types:_wrapper2] Processing raw feature "lamp.survey"...
[INFO:feature_types:_wrapper2] No saved raw data found, getting new...
[INFO:feature_types:_wrapper2] Saving raw data as "/home/_data/cortex_cache/survey_U1089294357_0_1621449536000.cortex"...

In addition to the survey score column, we also have a category column derived from custom survey groupings. The Cortex feature survey_scores automatically scores each question for you, whether it's a Likert scale, a list of options, True/False, and so on. It then groups questions from a single survey, such as "Weekly Survey", into predefined categories, like "Mood" and "Anxiety", to better capture symptom domains.
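
To get a feel for these groupings, you can inspect the category column directly (a quick sketch; the categories returned depend on how your surveys are configured):

# List the survey categories present for this participant.
print(df.category.unique())

# Summarize the score distribution within each category.
print(df.groupby('category').score.describe())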

Detecting anomalies using Luminol

Now, we feed the Luminol detector our score column. It processes the data and returns anomalous time windows, each tagged with an anomaly score. We then tag the survey scores in our DataFrame that lie within these windows with their respective anomaly scores. Note that we iterate over each category separately, so anomalies are tagged independently of survey scores from other categories.

for cat in df.category.unique():
    sub_df = df.loc[df.category == cat, 'score'].fillna(0).to_dict()
    detector = AnomalyDetector(sub_df, score_threshold=1.5)
    for a in detector.get_anomalies():
        ts = (df.index >= a.start_timestamp) & (df.index <= a.end_timestamp)
        df.loc[ts & (df.category == cat), 'anomaly'] = a.anomaly_score

Visualizing the anomalies using Altair

We'll use the Altair interactive plotting library to break question categories out into their own sub-charts. We'll also bring extra attention to anomalous survey score data points by increasing their size and changing their color.

alt.Chart(df).mark_point(filled=True).properties(width=500, height=50).encode(

    # The timestamp column was already converted by Cortex into a human-readable Date.
    x=alt.X('timestamp', title="Date"),

    # We know the score is clamped between [1 <= score <= 3] for this patient.
    y=alt.Y('score', title="Score", scale=alt.Scale(domain=[1, 3])),

    # Color anomalies non-linearly by severity (redder is worse).
    color=alt.Color('anomaly', title='Severity', scale=alt.Scale(type='sqrt', range=['#29629E', '#CA2C21'])),

    # Resize anomalies non-linearly by severity (larger is worse).
    size=alt.Size('anomaly', title='Severity', scale=alt.Scale(type='sqrt', range=[25, 500]))
).facet(

    # By 'faceting' the plot by the category column, we can split each survey category out into its own subplot.
    row='category'
)
[Faceted chart: one row per category (Anxiety, App Usability, Mood, Psychosis and Social, Sleep and Social), plotting Score (1-3) against Date (Sep 02 through Dec 23), with anomalous points enlarged and colored by Severity.]
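
If you'd like to keep the result outside of a notebook session, Altair charts can be exported to a standalone HTML file (assuming you assign the chart expression above to a variable, hypothetically named chart here):

chart.save('anomalies.html')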

Additional Usage

Configuration

Ensure your server_address is set correctly. If you are using the default server, it will be api.lamp.digital. Keep your access_key (sometimes an email address) and secret_key (sometimes a password) private, and do not share them with others. While you can pass these parameters as arguments to the cortex executable, it is preferable to set them as session-wide environment variables.
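
For example, in a bash shell (a sketch; the XXX values are placeholders for your own credentials):

export LAMP_SERVER_ADDRESS=api.lamp.digital
export LAMP_ACCESS_KEY=XXX
export LAMP_SECRET_KEY=XXX

You can also set the variables inline for a single run from the command line: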

LAMP_SERVER_ADDRESS=api.lamp.digital LAMP_ACCESS_KEY=XXX LAMP_SECRET_KEY=XXX python3 -m \
cortex significant_locations \
--id=U26468383 \
--start=1583532346000 \
--end=1583618746000 \
--k_max=9

Alternatively, here is an example using CLI arguments instead of environment variables (and redirecting the output to a file):

python3 -m \
cortex --format=csv --server-address=api.lamp.digital --access-key=XXX --secret-key=XXX \
survey --id=U26468383 --start=1583532346000 --end=1583618746000 \
2>/dev/null 1>./my_cortex_output.csv

Example

# environment variables must already contain LAMP configuration info
from pprint import pprint
from cortex import all_features, significant_locations, trips
pprint(all_features())
# step through the window one day (86,400,000 ms) at a time
for i in range(1583532346000, 1585363115000, 86400000):
    pprint(significant_locations(id="U26468383", start=i, end=i + 86400000))