Primary Features

Primary features are first-level computed metrics derived from raw data. They identify meaningful patterns and periods within raw sensor streams. The following primary features are available.

Output Structure

Primary features return bouts or periods:

{
    "data": [
        {"start": <ms>, "end": <ms>, "duration": <ms>, ...},
        ...
    ],
    "has_raw_data": 0 or 1
}

The has_raw_data flag distinguishes between "no data exists" (0) and "data exists but the computed result is empty" (1).
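
As an illustration of how a caller might use the flag, here is a minimal sketch. The helper name `describe_result` and the sample dictionaries are hypothetical; only the `data` and `has_raw_data` keys come from the structure above.

```python
def describe_result(result: dict) -> str:
    """Classify a primary-feature result using its has_raw_data flag."""
    if result["data"]:
        return "bouts found"
    if result["has_raw_data"] == 1:
        # Raw data exists, but the feature produced nothing from it.
        return "raw data present, but computed result is empty"
    return "no raw data in the window"


# Hypothetical results shaped like the output structure above.
empty_no_raw = {"data": [], "has_raw_data": 0}
empty_with_raw = {"data": [], "has_raw_data": 1}
with_bouts = {
    "data": [{"start": 0, "end": 1000, "duration": 1000}],
    "has_raw_data": 1,
}

for r in (empty_no_raw, empty_with_raw, with_bouts):
    print(describe_result(r))
```

Checking `data` first keeps the common case simple; the flag only matters when the list is empty.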

Summary

| Feature | Raw Dependency | Output | Downstream |
|---|---|---|---|
| acc_jerk | accelerometer | Jerk values (m/s³) | inactive_duration |
| game_level_scores | 6 game raw features | Per-level scores | game_results |
| screen_active | device_state | Screen-on bouts | screen_duration, inactive_duration |
| significant_locations | gps | Location clusters | hometime, entropy |
| survey_scores | survey | Category scores | survey_results |
| trips | gps | Movement trips | trip_distance, trip_duration |

Accelerometer Jerk (acc_jerk)

Computes jerk (rate of change of acceleration) from accelerometer data. Jerk is calculated as the magnitude of the acceleration derivative: sqrt((dx/dt)² + (dy/dt)² + (dz/dt)²).

Raw dependency: accelerometer

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| threshold | int | 500 | Max time gap (ms) between consecutive points. Gaps larger than this are excluded. |

Output fields:

| Field | Type | Description |
|---|---|---|
| start | int | Window start timestamp (ms) |
| end | int | Window end timestamp (ms) |
| acc_jerk | float | Jerk value (m/s³) |

Downstream: Used by inactive_duration to detect stillness.
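
The jerk formula and the `threshold` gap filter can be sketched as follows. This is a sketch, not the library's implementation. It assumes each raw reading is a dict with `timestamp` (ms), `x`, `y`, `z`, that readings are sorted by time, and that the derivative is taken per second; the function name `jerk_windows` is hypothetical.

```python
import math


def jerk_windows(readings, threshold=500):
    """Compute jerk between consecutive accelerometer readings.

    Pairs of readings more than `threshold` ms apart are skipped.
    """
    out = []
    for prev, cur in zip(readings, readings[1:]):
        dt_ms = cur["timestamp"] - prev["timestamp"]
        if dt_ms <= 0 or dt_ms > threshold:
            continue  # duplicate timestamp or gap too large
        dt = dt_ms / 1000.0  # assumption: time converted to seconds
        jerk = math.sqrt(
            ((cur["x"] - prev["x"]) / dt) ** 2
            + ((cur["y"] - prev["y"]) / dt) ** 2
            + ((cur["z"] - prev["z"]) / dt) ** 2
        )
        out.append(
            {"start": prev["timestamp"], "end": cur["timestamp"], "acc_jerk": jerk}
        )
    return out


# Hypothetical readings; the 900 ms gap between the 2nd and 3rd is excluded.
readings = [
    {"timestamp": 0, "x": 0.0, "y": 0.0, "z": 0.0},
    {"timestamp": 100, "x": 0.3, "y": 0.4, "z": 0.0},
    {"timestamp": 1000, "x": 0.3, "y": 0.4, "z": 0.0},
    {"timestamp": 1200, "x": 0.3, "y": 0.4, "z": 1.0},
]
windows = jerk_windows(readings)
print(windows)
```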


Game Level Scores (game_level_scores)

Extracts per-level performance scores from cognitive game activity events.

Raw dependencies: balloon_risk, cats_and_dogs, jewels_a, jewels_b, pop_the_bubbles, spatial_span

Parameters:

| Parameter | Type | Description |
|---|---|---|
| name_of_game | string | Required. One of: jewels_a, jewels_b, balloon_risk, cats_and_dogs, pop_the_bubbles, spatial_span |

Output fields (vary by game):

| Field | Description | Games |
|---|---|---|
| level | Level number | All |
| avg_tap_time | Average time between taps (ms) | All |
| perc_correct | Percentage of correct responses | All |
| jewels_collected | Number of jewels collected | Jewels A/B |
| avg_go_perc_correct | Go trial accuracy | Pop The Bubbles |
| avg_NO_go_perc_correct | No-go trial accuracy | Pop The Bubbles |
| avg_pumps | Average pumps per balloon | Balloon Risk |

Downstream: Used by game_results.


Screen Active (screen_active)

Identifies periods when the device screen was actively on, computed from device state events. Returns bouts of screen activity with start/end times.

Raw dependency: device_state

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| duration_threshold | int | 7200000 (2 hr) | Maximum allowable bout duration (ms). Bouts exceeding this are filtered out. |

Algorithm:

  • Maps state transitions: on-events = [0 (screen_on), 3 (unlocked)], off-events = [1 (screen_off), 2 (locked)]
  • Detects state changes to define screen-on bouts
  • Filters consecutive near-identical events (less than 1 sec apart) and bouts exceeding threshold
  • Validates against first activity timestamp for correct mapping
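
The first three steps above can be sketched as follows. This is an illustrative reading of the algorithm, not the library's code: it assumes events are dicts with `timestamp` (ms) and `value`, treats "near-identical" as the same value repeated within 1 second, and omits the validation against the first activity timestamp. The name `screen_bouts` is hypothetical.

```python
ON_STATES = {0, 3}   # screen_on, unlocked
OFF_STATES = {1, 2}  # screen_off, locked


def screen_bouts(events, duration_threshold=7_200_000):
    """Turn device_state events into screen-on bouts."""
    events = sorted(events, key=lambda e: e["timestamp"])

    # Drop repeats of the same value less than 1 s after the last kept event.
    deduped = []
    for e in events:
        if e["value"] not in ON_STATES | OFF_STATES:
            continue
        last = deduped[-1] if deduped else None
        if last and e["value"] == last["value"] and e["timestamp"] - last["timestamp"] < 1000:
            continue
        deduped.append(e)

    bouts, start = [], None
    for e in deduped:
        is_on = e["value"] in ON_STATES
        if is_on and start is None:
            start = e["timestamp"]  # off -> on transition opens a bout
        elif not is_on and start is not None:
            duration = e["timestamp"] - start
            if duration <= duration_threshold:  # overly long bouts are dropped
                bouts.append({"start": start, "end": e["timestamp"], "duration": duration})
            start = None
    return bouts


# Hypothetical events: one 60 s bout, then one bout longer than 2 hours.
events = [
    {"timestamp": 0, "value": 0},
    {"timestamp": 500, "value": 0},
    {"timestamp": 60_000, "value": 1},
    {"timestamp": 100_000, "value": 3},
    {"timestamp": 8_100_000, "value": 2},
]
bouts = screen_bouts(events)
print(bouts)
```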

Output fields:

| Field | Type | Description |
|---|---|---|
| start | int | Bout start timestamp (ms) |
| end | int | Bout end timestamp (ms) |
| duration | int | Bout duration (ms) |

Downstream: Used by screen_duration and inactive_duration.


Significant Locations (significant_locations)

Identifies significant locations from GPS data using spatial clustering algorithms. A significant location is a place where the participant spends substantial time.

Raw dependency: gps

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| method | string | "mode" | Clustering method: "mode" (frequency-based) or "k_means" |
| k_max | int | 10 | Maximum clusters to test (k-means only) |
| eps | float | 1e-5 | DBSCAN epsilon (k-means only) |
| min_cluster_size | float | 0.01 | Minimum cluster size as fraction of total points |
| max_dist | int | 300 | Maximum distance (meters) between clusters to merge |
| max_clusters | int | -1 | If -1, use min_cluster_size; otherwise limit to this count |

Algorithm (mode method):

  1. Rounds lat/long to 3 decimal places
  2. Counts point frequency per rounded location
  3. Selects top locations above min_cluster_size threshold
  4. Merges clusters within max_dist meters (Haversine distance)
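
The four steps above can be sketched as follows. This is a simplified illustration under stated assumptions: input is a list of `(lat, lon)` tuples, merging is greedy (a smaller cluster is folded into the first larger one within `max_dist`), and `proportion` is computed from point counts rather than time. The function names are hypothetical.

```python
import math
from collections import Counter


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6_371_000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def mode_clusters(points, min_cluster_size=0.01, max_dist=300):
    # Steps 1 and 2: round to 3 decimal places and count points per cell.
    counts = Counter((round(lat, 3), round(lon, 3)) for lat, lon in points)
    total = len(points)
    kept = []
    for (lat, lon), n in counts.most_common():
        if n / total < min_cluster_size:  # step 3: size threshold
            continue
        for c in kept:  # step 4: merge into a nearby, larger cluster
            if haversine_m(lat, lon, c["latitude"], c["longitude"]) <= max_dist:
                c["count"] += n
                break
        else:
            kept.append({"latitude": lat, "longitude": lon, "count": n})
    kept.sort(key=lambda c: -c["count"])
    return [
        {"latitude": c["latitude"], "longitude": c["longitude"],
         "rank": i, "proportion": c["count"] / total}
        for i, c in enumerate(kept)
    ]


# Hypothetical points: two cells about 111 m apart, plus one distant cell.
points = [(42.3601, -71.0589)] * 60 + [(42.3611, -71.0589)] * 30 + [(42.40, -71.10)] * 10
clusters = mode_clusters(points)
print(clusters)
```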

Output fields:

| Field | Type | Description |
|---|---|---|
| latitude | float | Cluster centroid latitude |
| longitude | float | Cluster centroid longitude |
| rank | int | 0 = most visited (typically home) |
| radius | float | Mean distance from centroid (meters) |
| proportion | float | Fraction of total time at this location (0–1) |
| duration | int | Time spent at location (ms) |

Downstream: Used by hometime and entropy.


Survey Scores (survey_scores)

Computes aggregate scores from survey responses using a configurable scoring dictionary.

Raw dependency: survey

Parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| scoring_dict | dict | {} | Maps questions to categories and scoring rules (see below) |
| return_ind_ques | bool | False | Return individual question scores in addition to category totals |

Scoring dictionary format:

scoring_dict = {
    "category_list": ["anxiety", "depression"],
    "questions": {
        "I feel nervous": {"category": "anxiety", "scoring": "value"},
        "I feel sad": {"category": "depression", "scoring": "boolean"}
    }
}

Scoring types: "value" (cast to int), "boolean" (Yes→1, No→0), "raw" (no transform), or a custom mapping name.
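
To make the built-in scoring types concrete, here is a small sketch that applies them to one set of responses. It assumes responses arrive as a question-to-answer dict and that a category total is the sum of its question scores; custom mappings are not handled because their format is not described here. The helper names are hypothetical.

```python
def score_answer(answer, scoring):
    """Apply one of the built-in scoring types to a single answer."""
    if scoring == "value":
        return int(answer)
    if scoring == "boolean":
        return {"Yes": 1, "No": 0}[answer]
    if scoring == "raw":
        return answer
    raise ValueError(f"custom mapping {scoring!r} is not handled in this sketch")


def category_totals(responses, scoring_dict):
    """Sum question scores per category (summing is an assumption)."""
    totals = {cat: 0 for cat in scoring_dict["category_list"]}
    for question, answer in responses.items():
        rule = scoring_dict["questions"].get(question)
        if rule is None:
            continue  # question not covered by the scoring dictionary
        totals[rule["category"]] += score_answer(answer, rule["scoring"])
    return totals


scoring_dict = {
    "category_list": ["anxiety", "depression"],
    "questions": {
        "I feel nervous": {"category": "anxiety", "scoring": "value"},
        "I feel sad": {"category": "depression", "scoring": "boolean"},
    },
}
# Hypothetical responses for one survey submission.
responses = {"I feel nervous": "3", "I feel sad": "Yes", "Unscored item": "n/a"}
totals = category_totals(responses, scoring_dict)
print(totals)
```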

Output fields:

| Field | Type | Description |
|---|---|---|
| start | int | Survey start timestamp (ms) |
| end | int | Survey end timestamp (ms) |
| category | string | Score category name |
| question | string | Question text or category name |
| score | number | Numeric score |

Downstream: Used by survey_results.


Trips (trips)

Identifies movement trips from GPS data based on speed thresholds.

Raw dependency: gps

Algorithm:

  • Speed threshold: 10 km/h (points below this are "stationary")
  • Time threshold: 600 seconds (10 min); gaps larger than this break a trip
  • Detects stationary → moving transitions (trip start) and moving → stationary transitions (trip end)
  • Uses Haversine formula for great-circle distance
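
The thresholds and transitions above can be sketched as follows. This is an illustrative reading, not the library's implementation: it assumes input is a time-sorted list of `(timestamp_ms, lat, lon)` tuples, computes speed between consecutive points, and records the trip's first point as its location. The function names are hypothetical.

```python
import math

SPEED_KMH = 10.0  # below this, a segment counts as stationary
MAX_GAP_S = 600   # gaps longer than this break a trip


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def detect_trips(points):
    trips, current = [], None
    for (t0, la0, lo0), (t1, la1, lo1) in zip(points, points[1:]):
        dt_s = (t1 - t0) / 1000.0
        dist = haversine_km(la0, lo0, la1, lo1)
        moving = 0 < dt_s <= MAX_GAP_S and dist / (dt_s / 3600.0) >= SPEED_KMH
        if moving:
            if current is None:  # stationary -> moving: trip start
                current = {"start": t0, "end": t1, "latitude": la0,
                           "longitude": lo0, "distance": 0.0}
            current["end"] = t1
            current["distance"] += dist
        elif current is not None:  # moving -> stationary, or gap: trip end
            trips.append(current)
            current = None
    if current is not None:
        trips.append(current)
    return trips


# Hypothetical track: still, two fast 60 s segments, still, then a long gap.
points = [
    (0, 42.00, -71.0), (60_000, 42.00, -71.0), (120_000, 42.01, -71.0),
    (180_000, 42.02, -71.0), (240_000, 42.02, -71.0), (1_000_000, 42.50, -71.0),
]
trips = detect_trips(points)
print(trips)
```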

Output fields:

| Field | Type | Description |
|---|---|---|
| start | int | Trip start timestamp (ms) |
| end | int | Trip end timestamp (ms) |
| latitude | float | Trip location |
| longitude | float | Trip location |
| distance | float | Total distance traveled (km) |

Downstream: Used by trip_distance and trip_duration.


Usage

Primary features can be called directly or computed via cortex.run():

import cortex

# start_time and end_time are timestamps in milliseconds, set by the caller.

# Direct call
result = cortex.primary.significant_locations.significant_locations(
    id="U1234567890", start=start_time, end=end_time
)

# Via cortex.run()
result = cortex.run("U1234567890", features=["significant_locations"])