Simulatrex Synthetic Audience Docs
- class transforms.autobrew.trft.Subspace.PartitionConfig(group_defining_columns, train_data_defining_columns, partition_at)
Bases: object
Creates a structured schema for subpartition training.
- Parameters:
group_defining_columns (list) – Names of the columns that define a group member.
train_data_defining_columns (list[dict]) – Schema of the data columns to train on (those that do not define the user). These can be question/reply pairs, where the column name is the question and the column value is the reply.
partition_at (dict) – Describes the existing column that splits the members into groups. The column must be one of group_defining_columns.
Examples
>>> # Defines the group (e.g. demographically split users)
>>> group_defining_columns = ['name', 'gender', 'monthly_income']

>>> # Defines the training data schema
>>> train_data_defining_columns = [{
...     'columnName': 'how much do you spend on skincare products monthly',
...     'dataType': 'string',
...     'valueType': 'enum',
...     'validOptions': ['option a', 'option b', 'option c']
... },{
...     'columnName': 'where do you shop for skincare products?',
...     'dataType': 'string',
...     'valueType': 'multipleChoice',
...     'validOptions': ['Brand websites', 'Department stores', 'option c']
... },
... {
...     'columnName': 'tell me something about yourself',
...     'dataType': 'string',
...     'valueType': 'freeText',
... },
... ]

>>> # Divide data into groups
>>> from math import inf
>>> partition_at = {
...     'column': 'monthly_income',
...     'dataType': 'int',
...     'groups': [{
...         'name': 'low_spenders',
...         'lower_bound': 0,
...         'upper_bound': 50,
...     },{
...         'name': 'mid_spenders',
...         'lower_bound': 51,
...         'upper_bound': 150,
...     },{
...         'name': 'high_spenders',
...         'lower_bound': 151,
...         'upper_bound': inf,
...     }]
... }

>>> # Groups can be partitioned with a lower and upper bound (see above).
>>> # However, string data cannot be compared directly, so every option
>>> # that belongs to a group must be listed explicitly:
>>> partition_at = {
...     'column': 'monthly_income',
...     'dataType': 'string',
...     'groups': [{
...         'name': 'low_spenders',
...         'members': ['<5€', '5€-25€', '26€ - 50€'],
...     },{
...         'name': 'mid_spenders',
...         'members': ['51€ - 100€', '100€ - 150€'],
...     },{
...         'name': 'high_spenders',
...         'members': ['above 150€'],
...     }]
... }
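To illustrate how the two styles of group definition can be read, here is a minimal sketch of assigning a single value to a group. The helper assign_group is hypothetical and not part of the library, and it assumes that lower_bound and upper_bound are both inclusive, which the documentation does not state.

```python
from math import inf


def assign_group(value, partition_at):
    """Return the name of the first group containing value, or None.

    Hypothetical helper for illustration only; not part of the library.
    """
    for group in partition_at["groups"]:
        if "members" in group:
            # Explicit membership: used for string values that cannot be ordered.
            if value in group["members"]:
                return group["name"]
        elif group["lower_bound"] <= value <= group["upper_bound"]:
            # Range membership: both bounds treated as inclusive (assumption).
            return group["name"]
    return None


numeric_partition = {
    "column": "monthly_income",
    "dataType": "int",
    "groups": [
        {"name": "low_spenders", "lower_bound": 0, "upper_bound": 50},
        {"name": "mid_spenders", "lower_bound": 51, "upper_bound": 150},
        {"name": "high_spenders", "lower_bound": 151, "upper_bound": inf},
    ],
}

string_partition = {
    "column": "monthly_income",
    "dataType": "string",
    "groups": [
        {"name": "low_spenders", "members": ["<5€", "5€-25€", "26€ - 50€"]},
        {"name": "mid_spenders", "members": ["51€ - 100€", "100€ - 150€"]},
        {"name": "high_spenders", "members": ["above 150€"]},
    ],
}

print(assign_group(75, numeric_partition))
print(assign_group("above 150€", string_partition))
```

A value that matches no group yields None in this sketch; how the trainer itself treats such values is not documented here.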
Important
group_defining_columns and partition_at depend on each other. The partition_at column must be present in group_defining_columns, and the definition in groups must represent the data passed to the trainer.
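The dependency described above can be checked before training. The following is a sketch only: check_partition is a hypothetical helper, not a library function, and it assumes inclusive bounds for range-based groups.

```python
def check_partition(group_defining_columns, partition_at, observed_values):
    """Return a list of problems; an empty list means the check passed.

    Hypothetical pre-flight check for illustration; not part of the library.
    """
    problems = []
    column = partition_at["column"]
    if column not in group_defining_columns:
        problems.append(f"{column!r} is missing from group_defining_columns")

    def covered(value):
        # A value is covered if a group lists it or brackets it
        # (inclusive bounds are an assumption).
        for group in partition_at["groups"]:
            if "members" in group:
                if value in group["members"]:
                    return True
            elif group["lower_bound"] <= value <= group["upper_bound"]:
                return True
        return False

    # dict.fromkeys removes duplicates while keeping first-seen order.
    for value in dict.fromkeys(observed_values):
        if not covered(value):
            problems.append(f"value {value!r} does not fall into any group")
    return problems


group_defining_columns = ["name", "gender", "monthly_income"]
partition_at = {
    "column": "monthly_income",
    "dataType": "string",
    "groups": [
        {"name": "low_spenders", "members": ["<5€", "5€-25€", "26€ - 50€"]},
        {"name": "mid_spenders", "members": ["51€ - 100€", "100€ - 150€"]},
        {"name": "high_spenders", "members": ["above 150€"]},
    ],
}

# '200€' appears in the data but is not listed in any group.
print(check_partition(group_defining_columns, partition_at, ["<5€", "200€"]))
```

Running a check like this on the actual values of the partition column can surface options that were left out of every group.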
- class transforms.autobrew.trft.Subspace.Subspace(config: PartitionConfig, rank: int = 8, via_ranges=False, via_explicit=True)
Bases: object
Divides multiple subgroups into partitions of the trained layer.
- Parameters:
config (transforms.autobrew.trft.Subspace.PartitionConfig) – A config representing the subspace partitioning scheme.
rank (int) – The rank of the trained layer. Defaults to 8.
- create_subspace_partition(via_ranges=False, via_explicit=False) → list
Creates a subspace partition list targeting layers. The partitions are defined either as layer partition ranges, e.g. [[0, 384], [384, 768]], or by explicitly naming the elements of each partition, e.g. [[[0,1,3,4]]].
- Parameters:
via_ranges (bool) – Whether to define the partitions via ranges (see above).
via_explicit (bool) – Whether to define the partitions explicitly (see above).
Note
Either via_ranges or via_explicit needs to be True.
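The relationship between the two representations can be sketched as follows. Both helpers are hypothetical and only illustrate the idea. The sketch assumes that a range [start, end] is half-open (end excluded), which is suggested by 384 appearing as both an end and a start in the example above, and that the layer is split as evenly as possible. Neither assumption is confirmed by the documentation.

```python
def split_into_ranges(size, n_groups):
    """Split indices 0..size-1 into n_groups contiguous [start, end] ranges.

    Hypothetical helper; an even split is an assumption.
    """
    base, extra = divmod(size, n_groups)
    ranges, start = [], 0
    for i in range(n_groups):
        # The first `extra` groups receive one additional index.
        end = start + base + (1 if i < extra else 0)
        ranges.append([start, end])
        start = end
    return ranges


def ranges_to_explicit(ranges):
    """Expand [start, end] ranges into explicit index lists (end excluded)."""
    return [list(range(start, end)) for start, end in ranges]


print(split_into_ranges(768, 2))
print(ranges_to_explicit(split_into_ranges(8, 2)))
```

The range form stays compact for wide layers, while the explicit form can also describe non-contiguous partitions such as [0, 1, 3, 4].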
- get_partition(column_name, key='name', groups='groups')
Returns the elements for this named partition.
Example
>>> subspace.get_partition('low_spenders')
>>> # [0, 1, 2, 3]
- mix_subspaces()