This project combines computing technology with personal natural language datasets such as emails, text messages, diary entries, and voice recordings.
Below are Python Notebooks for:
- exporting datasets from native application formats (e.g., Gmail mbox, Android SMS xml) into stable forms (e.g., txt files) that may outlive the application and allow further processing
- basic data mining and processing by large language models to identify recurring topics and communication patterns
- tuning large language models to these datasets
- prompting language models for personality and autobiographical information
- Description: A Python Notebook for exporting Gmail Mbox files and converting them to TXT and JSON formats.
- Description: A Python Notebook for converting SMS XML files to TXT, CSV, and JSON formats.
- Description: A Python Notebook for transcribing audio files and filtering audio and transcripts for specific content.
- Description: A Python Notebook for labeling SMS (Short Message Service) conversations with OpenAI's API.
- Description: A Python Notebook for labeling journal entries with OpenAI's API.
- Description: A Python Notebook for fine-tuning a GPT-3.5 Turbo model with OpenAI's API using the role-based system/user/assistant format.
- Description: A Python Notebook implementing conversations with a fine-tuned GPT-3.5 Turbo model.
- Description: A Python Notebook for conducting a questionnaire related to the 16 Personality Factors model on a Large Language Model.
- Description: A Python Notebook containing prompts for autobiographical writing.