Skip to main content

Cortex Quick Start Guide

Setting up Cortex

You will need Python 3.4+ and pip installed in order to use Cortex.

  • You may need root permissions, using sudo.
  • Alternatively, to install locally, use pip --user.
  • If pip is not recognized as a command, use python3 -m pip.

If you meet the prerequisites, install Cortex:

pip install lamp-cortex

If you do not have your environment variables set up (see Advanced section below), you will need to perform the initial server credentials configuraton below:

import os
os.environ['LAMP_ACCESS_KEY'] = 'YOUR_EMAIL_ADDRESS'
os.environ['LAMP_SECRET_KEY'] = 'YOUR_PASSWORD'
os.environ['LAMP_SERVER_ADDRESS'] = 'YOUR_SERVER_ADDRESS'

Source Code

The source code can be found here: https://github.com/BIDMCDigitalPsychiatry/LAMP-cortex

Example: Passive data features from Cortex

The primary function of Cortex is to provide a set of features derived from pasive data. Data can be pulled either by calling Cortex functions directly, or by using the cortex.run() function to parse multiple participants or features simultaneously. For example, one feature of interest is screen_duration or the time spent with the phone "on".

First, we can pull this data using the Cortex function. Let's say we want to compute the amount of time spent by participant: "U1234567890" from 11/15/21 (epoch time: 1636952400000) to 11/30/21 (epoch time: 1638248400000) each day (resolution = miliseconds in a day = 86400000):

import cortex
screen_dur = cortex.secondary.screen_duration.screen_duration("U1234567890", start=1636952400000, end=1638248400000, resolution=86400000)

The output would look something like this:

{'timestamp': 1636952400000,
'duration': 1296000000,
'resolution': 86400000,
'data': [{'timestamp': 1636952400000, 'value': 0.0},
{'timestamp': 1637038800000, 'value': 0.0},
{'timestamp': 1637125200000, 'value': 0.0},
{'timestamp': 1637211600000, 'value': 0.0},
{'timestamp': 1637298000000, 'value': 0.0},
{'timestamp': 1637384400000, 'value': 0.0},
{'timestamp': 1637470800000, 'value': 8425464},
{'timestamp': 1637557200000, 'value': 54589034},
{'timestamp': 1637643600000, 'value': 50200716},
{'timestamp': 1637730000000, 'value': 38500923},
{'timestamp': 1637816400000, 'value': 38872835},
{'timestamp': 1637902800000, 'value': 46796405},
{'timestamp': 1637989200000, 'value': 42115755},
{'timestamp': 1638075600000, 'value': 44383154}]}

The 'data' in the dictionary holds the start timestamps (of each day from 11/15/21 to 11/29/21) and the screen duration for each of these days.

Second, we could have pulled this same data using the cortex.run function. Note that resolution is automatically set to a day in cortex.run. To invoke cortex.run, you must provide a specific ID or a list of IDs (only Researcher, Study, or Participant IDs are supported). Then, you specify the behavioral features to generate and extract. Once Cortex finishes running, you will be provided a dict where each key is the behavioral feature name, and the value is a dataframe. You can use this dataframe to save your output to a CSV file, for example, or continue data processing and visualization. This function call would look like this:

import cortex
screen_dur = cortex.run("U1234567890", ['screen_duration'], start=1636952400000, end=1638248400000)

And the output might look like:

{'screen_duration':              id           timestamp       value
0 U1234567890 2021-11-15 05:00:00 0.0
1 U1234567890 2021-11-16 05:00:00 0.0
2 U1234567890 2021-11-17 05:00:00 0.0
3 U1234567890 2021-11-18 05:00:00 0.0
4 U1234567890 2021-11-19 05:00:00 0.0
5 U1234567890 2021-11-20 05:00:00 0.0
6 U1234567890 2021-11-21 05:00:00 8425464.0
7 U1234567890 2021-11-22 05:00:00 54589034.0
8 U1234567890 2021-11-23 05:00:00 50200716.0
9 U1234567890 2021-11-24 05:00:00 38500923.0
10 U1234567890 2021-11-25 05:00:00 38872835.0
11 U1234567890 2021-11-26 05:00:00 46796405.0
12 U1234567890 2021-11-27 05:00:00 42115755.0
13 U1234567890 2021-11-28 05:00:00 44383154.0}

The output is the same as above, except the 'data' has been transformed into a Pandas DataFrame. Additionally, the dictionary is indexed by feature -- this way you can add to the list of features processed at once. Finally, a column "id" has been added so that multiple participants can be processed simultaneously.

We can process two participants and add "entropy" to our feature list. Instead of 11/15-11/29/21, let's change the timeframe to the past 7 days. If you try this with your own participant IDs, it may take a moment to run:

import cortex
MS_IN_A_DAY = 1000 * 60 * 60 * 24 # The miliseconds in a day
features = cortex.run(["U1234567890", "U0011008800"], ['screen_duration', 'entropy'], start=cortex.now() - 7 * MS_IN_A_DAY, end=cortex.now())

Output:

{'screen_duration':             id           timestamp       value
0 U1234567890 2021-12-05 05:00:00 37035845.0
1 U1234567890 2021-12-06 05:00:00 53403478.0
2 U1234567890 2021-12-07 05:00:00 40274745.0
3 U1234567890 2021-12-08 05:00:00 46607703.0
4 U1234567890 2021-12-09 05:00:00 50506566.0
5 U1234567890 2021-12-10 05:00:00 45152245.0
0 U0011008800 2021-12-05 05:00:00 18144929.0
1 U0011008800 2021-12-06 05:00:00 49786516.0
2 U0011008800 2021-12-07 05:00:00 18542471.0
3 U0011008800 2021-12-08 05:00:00 18710925.0
4 U0011008800 2021-12-09 05:00:00 0.0
5 U0011008800 2021-12-10 05:00:00 0.0,
'entropy': id timestamp value
0 U1234567890 2021-12-05 05:00:00 -0.000000
1 U1234567890 2021-12-06 05:00:00 -0.000000
2 U1234567890 2021-12-07 05:00:00 -0.000000
3 U1234567890 2021-12-08 05:00:00 -0.000000
4 U1234567890 2021-12-09 05:00:00 0.491646
5 U1234567890 2021-12-10 05:00:00 -0.000000
0 U0011008800 2021-12-05 05:00:00 -0.000000
1 U0011008800 2021-12-06 05:00:00 -0.000000
2 U0011008800 2021-12-07 05:00:00 0.214006
3 U0011008800 2021-12-08 05:00:00 0.191434
4 U0011008800 2021-12-09 05:00:00 NaN
5 U0011008800 2021-12-10 05:00:00 NaN}

Example: Retrieving survey data from Cortex

We can also run the survey feature (which is not a behavioral feature, but rather a convenience around raw survey data) and save it to a csv using Pandas:

import cortex
cortex.run('YOUR_RESEARCHER_ID', ['survey'], start=0, end=cortex.now())['survey'].to_csv('~/export.csv', index=False)

Yielding the following CSV output:

id,timestamp,survey,item,value,type,duration,level
U123456789,2020-01-16 20:57:01,RA,RA Initials,test,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire, I feel better about myself,Neither agree nor disagree,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire,I feel able to take chances in life,Agree Strongly,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire,I am able to develop positive relationships with other people ,Agree,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire, I feel part of society rather than isolated,Neither agree nor disagree,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire,I am able to assert myself,Disagree ,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire,I feel that my life has a purpose ,Agree Strongly,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire,My experiences have changed me for the better,Agree,,0,
U123456789,2020-01-16 20:56:50,SELF REPORT: Process of Recovery Questionnaire, I have been able to come to terms with things that have happened to me in the past and move on with my life,Disagree,,0,

You can then load this CSV file into Microsoft Excel (or Apple Numbers on macOS). Additionally, you can add Categories to group the data by ID, survey, and the specific time that the survey was taken.