AI Model for complex structures #735

pietz · 2024-01-12T11:54:23Z

First check

I added a descriptive title to this issue.
I used the GitHub search to look for a similar issue and didn't find it.
I searched the Marvin documentation for this feature.

Describe the current behavior

The AI model works well for rather simple data schemas but when the complexity grows and output documents become longer, it starts to fall apart.

The first stage I observed is that it starts "quoting" content from the original document. Changing date formats or summarizing text no longer works because the model resorts to copying and pasting. In the second stage, when the schema gets even more complex, you start getting validation errors and the model becomes unusable.

The length limitation is also a problem in some of my use cases. Generating a list of 50 elements might hit the output length limit, which is too bad. I started experimenting with taking complex or long models apart into multiple smaller and simpler models. This works so well that I was able to convert all my GPT-4 use cases to GPT-3.5.

I would like to suggest a change that makes it possible to handle recursive AI model calls. At the moment you turn your top-level model into the AI model with a decorator, which handles all underlying pydantic models in a single call. I think it would be great if I could use the same decorator on some of these underlying models, in which case marvin would split its LLM call into multiple calls.

That way you could dynamically experiment with different levels of complexity until it works to your liking. Furthermore, the split calls could be executed asynchronously, which would greatly reduce the response time.

Describe the proposed behavior

import marvin
from pydantic import BaseModel, Field

class Job(BaseModel):
    """A Job or position extracted from the resume"""
    position: str = Field(..., description="Name of the position")
    company: str = Field(..., description="Company name")
    # Optional fields are typed `| None` so that a default of None is valid.
    start_date: str | None = Field(None, description="Start date of the job")
    end_date: str | None = Field(None, description="End date of the job or 'Present'")
    top_keywords: list[str] | None = Field(None, description="List of max. top 10 keywords, skills and technologies used for the job")

class Degree(BaseModel):
    """Degree or other type of education extracted from the resume"""
    name: str = Field(..., description="Name of the degree and field of study")
    institution: str | None = Field(None, description="University name")
    start_date: str | None = Field(None, description="Start date of the studies")
    end_date: str | None = Field(None, description="End date of the studies")

# `client` is assumed to be defined elsewhere.
@marvin.ai_model(client=client, temperature=0.0)
class Resume(BaseModel):
    """Resume data extracted from the resume"""
    name: str = Field(..., description="The name of the person")
    email: str | None = Field(None, description="Email address of the person")
    phone: str | None = Field(None, description="Phone number of the person")
    location: str | None = Field(None, description="Current residence of the person")
    websites: str | None = Field(None, description="Website like LinkedIn, GitHub, Behance, etc.")
    work_experience: list[Job] | None = None
    education: list[Degree] | None = None
    skills: list[str] | None = Field(None, description="List of core skills and technologies")
    languages: list[str] | None = Field(None, description="List of languages spoken by the person")

This model would likely be too complex to work well in the real world. What if we could also apply the ai_model decorator to the other two models, so that when the final Resume model is called, marvin splits the request into three pieces?
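
For illustration, here is a rough sketch of what such a three-way split could look like when done by hand, with the pieces requested concurrently. Everything in it is an assumption made for the example: `extract` is a hypothetical stand-in for one small LLM call (it is not a Marvin function) and returns canned data so the code runs offline, and plain dataclasses with a reduced set of fields stand in for the pydantic models.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Job:
    position: str
    company: str


@dataclass
class Degree:
    name: str
    institution: str


@dataclass
class Resume:
    name: str
    work_experience: list[Job] = field(default_factory=list)
    education: list[Degree] = field(default_factory=list)


async def extract(target: str, text: str):
    """Hypothetical stand-in for one small LLM call; returns canned data."""
    await asyncio.sleep(0.01)  # simulate network latency
    canned = {
        "name": "Jane Doe",
        "jobs": [Job(position="Engineer", company="Example Corp")],
        "degrees": [Degree(name="BSc Computer Science", institution="Example University")],
    }
    return canned[target]


async def extract_resume(text: str) -> Resume:
    # Three small requests run concurrently instead of one large request.
    name, jobs, degrees = await asyncio.gather(
        extract("name", text),
        extract("jobs", text),
        extract("degrees", text),
    )
    return Resume(name=name, work_experience=jobs, education=degrees)


resume = asyncio.run(extract_resume("...resume text..."))
print(resume.name, len(resume.work_experience), len(resume.education))
```

Because the three pieces do not depend on each other, `asyncio.gather` can issue them at the same time, so the total latency is roughly that of the slowest piece rather than the sum of all three.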

Example Use

No response

Additional context

What do you think?

zzstoatzz · 2024-01-17T22:58:38Z

hi @pietz

"taking complex or long models apart in multiple smaller and simpler models"

I have also found a lot of success with this strategy

this is an interesting idea

"handle recursive AI model calls"

on first thought I'm inclined to say that I would want to write a util myself that implements something like a recursive pattern to fill out a parent and then child models. I'd be interested to see any sketches you have on implementation!

pietz · 2024-01-21T11:01:07Z

The cleanest API from a user perspective would be this: if a "sub model" is not a direct subclass of BaseModel but of marvin.Model instead, the request would be split into multiple recursive calls.

Consider this example:

class Address(BaseModel):
    street: str
    zip: str
    state: str

class User(marvin.Model):
    name: str
    email: str
    address: Address

This would lead to a single call, as currently implemented. However...

class Address(marvin.Model):
    street: str
    zip: str
    state: str

class User(marvin.Model):
    name: str
    email: str
    address: Address

...would lead to two calls. Address would be called standalone first; this can be determined programmatically because it consists only of non-BaseModel types. It would then return the parsed address, which is assigned to the address field in User. Next, the User model would be called without address, because that value already exists.
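
The call order described above could be derived roughly as follows. This is only a sketch under stated assumptions: a hypothetical `SplitModel` base class stands in for the proposed marvin.Model marker, plain dataclasses stand in for pydantic models, and `plan_calls` is an illustrative helper, not part of Marvin. It walks the fields depth-first, schedules marked sub-models as their own calls, and leaves them out of the parent's request.

```python
from dataclasses import dataclass, fields
from typing import get_type_hints


class SplitModel:
    """Hypothetical marker: models inheriting from it get their own LLM call."""


@dataclass
class Address(SplitModel):
    street: str
    zip: str
    state: str


@dataclass
class User(SplitModel):
    name: str
    email: str
    address: Address


def plan_calls(model: type) -> list[tuple[str, list[str]]]:
    """Return (model name, fields requested in that call), children first."""
    calls: list[tuple[str, list[str]]] = []
    own_fields: list[str] = []
    hints = get_type_hints(model)
    for f in fields(model):
        field_type = hints[f.name]
        if isinstance(field_type, type) and issubclass(field_type, SplitModel):
            # Marked sub-model: resolve it in its own call(s) first.
            calls.extend(plan_calls(field_type))
        else:
            # Plain field or unmarked nested model: stays in this call.
            own_fields.append(f.name)
    calls.append((model.__name__, own_fields))
    return calls


print(plan_calls(User))
# [('Address', ['street', 'zip', 'state']), ('User', ['name', 'email'])]
```

If Address did not inherit from the marker, the plan would collapse to a single User call that includes the address field. Container types such as a list of sub-models are not handled in this sketch and would need their own rule.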

What I'm asking for is a major piece of functionality, because it also brings up the idea of async calls, which I will open another issue for.

What do you think?

zzstoatzz · 2024-01-30T21:35:20Z

i like this idea! I have a rough sense of how we can do this and I can explore some implementations soon

pietz · 2024-01-31T09:01:13Z

@zzstoatzz What are some ways of helping Marvin development at the moment? I'm using this library a lot and I would like to contribute instead of asking for features all the time.

zzstoatzz · 2024-02-04T18:47:29Z

we are open to all types of contributions - feel free to open a draft PR that achieves what you want to see and then we can chat on that!

pietz added the enhancement label (New feature or request) on Jan 12, 2024
