AI Model for complex structures #735
Comments
hi @pietz
I have also found a lot of success with this strategy, and this is an interesting idea.
On first thought, I'm inclined to say that I would want to write a util myself that implements something like a recursive pattern to fill out a parent model and then its child models. I'd be interested to see any sketches you have on implementation!
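A minimal sketch of what such a recursive util could look like. This is not Marvin's implementation: stdlib dataclasses stand in for pydantic models so the snippet runs without dependencies, and `fill`, `Extractor`, and `fake_extract` are hypothetical names. The extractor represents one LLM call that returns values for a model's flat fields only.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Callable, get_type_hints

# Hypothetical stand-in for a single LLM call: given a model class and the
# source text, return a dict with values for that model's flat fields.
Extractor = Callable[[type, str], dict]


def fill(model: type, text: str, extract: Extractor) -> Any:
    """Fill `model` with one extractor call, then recurse into nested models."""
    hints = get_type_hints(model)
    values = dict(extract(model, text))  # one call for this level
    for f in fields(model):
        sub = hints[f.name]
        if is_dataclass(sub):  # nested model: gets its own call
            values[f.name] = fill(sub, text, extract)
    return model(**values)


@dataclass
class Address:
    street: str
    zip: str
    state: str


@dataclass
class User:
    name: str
    email: str
    address: Address


calls = []  # records which models triggered a call, in order


def fake_extract(model: type, text: str) -> dict:
    """Canned answers instead of a real LLM, for illustration only."""
    calls.append(model.__name__)
    canned = {
        "User": {"name": "Ada", "email": "ada@example.com"},
        "Address": {"street": "1 Main St", "zip": "12345", "state": "NY"},
    }
    return canned[model.__name__]


user = fill(User, "some source document", fake_extract)
```

Here the parent is requested first and each nested model afterwards, so `calls` ends up with one entry per model.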
The cleanest API from a user perspective would be that if a "sub model" is not a direct instance of

Consider this example:

```python
from pydantic import BaseModel
import marvin


class Address(BaseModel):
    street: str
    zip: str
    state: str


class User(marvin.Model):
    name: str
    email: str
    address: Address
```

This would lead to a normal call, as currently implemented. However...

```python
class Address(marvin.Model):
    street: str
    zip: str
    state: str


class User(marvin.Model):
    name: str
    email: str
    address: Address
```

...would lead to two calls. What I'm asking for is a major piece of functionality, because it also brings in the idea of async calls, for which I will open another issue. What do you think?
I like this idea! I have a rough sense of how we can do this and I can explore some implementations soon.
@zzstoatzz What are some ways of helping Marvin development at the moment? I'm using this library a lot and I would like to contribute instead of asking for features all the time.
We are open to all types of contributions. Feel free to open a draft PR that achieves what you want to see and then we can chat on that!
First check
Describe the current behavior
The AI model works well for rather simple data schemas, but when the complexity grows and output documents become longer, it starts to fall apart.
The first stage I observed is that it starts "quoting" content from the original document. Changing date formats or summarizing text no longer works, because the model starts to copy and paste. The second stage, when the schema gets even more complex, is that you start getting validation errors and the model becomes unusable.
The length limitation is also a problem in some of my use cases. Generating a list of 50 elements might hit the output length limit, which is too bad. I started experimenting with splitting complex or long models into multiple smaller and simpler models. This works so well that I was able to convert all my GPT-4 use cases to GPT-3.5.
I would like to suggest a change that can handle recursive AI model calls. At the moment you turn your top-level model into the AI model with a decorator, which treats all underlying pydantic models at the same time. I think it would be great if I could use the same decorator for some of these underlying models, in which case marvin would split its LLM call into multiple calls.
That way you could dynamically experiment with different levels of complexity until it works to your liking. Furthermore, the separate calls can be executed asynchronously, which greatly reduces the response time.
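To illustrate the asynchronous part, here is a rough sketch under the same assumptions as any dependency-free toy: dataclasses stand in for pydantic models, and `fill_async` and `fake_extract` are hypothetical names, not Marvin APIs. The parent's flat fields and every nested model are requested concurrently with `asyncio.gather`.

```python
import asyncio
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Awaitable, Callable, get_type_hints

# Hypothetical stand-in for one async LLM call per model.
AsyncExtractor = Callable[[type, str], Awaitable[dict]]


async def fill_async(model: type, text: str, extract: AsyncExtractor) -> Any:
    """Request the parent's flat fields and all nested models concurrently."""
    hints = get_type_hints(model)
    nested = {
        f.name: hints[f.name] for f in fields(model) if is_dataclass(hints[f.name])
    }
    results = await asyncio.gather(
        extract(model, text),
        *(fill_async(sub, text, extract) for sub in nested.values()),
    )
    values = dict(results[0])  # flat fields of this model
    for i, name in enumerate(nested, start=1):
        values[name] = results[i]  # gather preserves argument order
    return model(**values)


@dataclass
class Address:
    street: str
    state: str


@dataclass
class User:
    name: str
    address: Address


in_flight = 0  # calls currently running
peak = 0       # highest number of overlapping calls observed


async def fake_extract(model: type, text: str) -> dict:
    """Canned answers with a short delay, standing in for network latency."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    canned = {
        "User": {"name": "Ada"},
        "Address": {"street": "1 Main St", "state": "NY"},
    }
    return canned[model.__name__]


user = asyncio.run(fill_async(User, "some source document", fake_extract))
```

Because both calls are started before either finishes, total latency is roughly that of the slowest single call rather than the sum of all calls.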
Describe the proposed behavior
This model would likely be too complex to work well in the real world. What if we could also apply the ai_model decorator to the other 2 models, so that calling the final Resume model makes marvin split the request into 3 pieces?
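The original Resume example is not shown here, so the following is only a sketch with invented models: `Education`, `Experience`, and all field names are hypothetical, dataclasses stand in for decorated pydantic models, and `plan_requests` is not a Marvin function. It shows how a nested model could be walked to decide which separate requests it would be split into.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import get_args, get_origin, get_type_hints


def plan_requests(model: type) -> list[str]:
    """List the separate requests a nested model would be split into."""
    plan = [model.__name__]  # the parent is always one request
    hints = get_type_hints(model)
    for f in fields(model):
        hint = hints[f.name]
        args = get_args(hint)
        # Treat list[Sub] like Sub: the element model gets its own request.
        if get_origin(hint) is list and args:
            hint = args[0]
        if is_dataclass(hint):
            plan.extend(plan_requests(hint))
    return plan


# Hypothetical models, invented for illustration.
@dataclass
class Education:
    school: str
    degree: str


@dataclass
class Experience:
    company: str
    title: str


@dataclass
class Resume:
    name: str
    education: list[Education]
    experience: list[Experience]


plan = plan_requests(Resume)
print(plan)  # ['Resume', 'Education', 'Experience']
```

With these assumed models, the plan contains 3 entries, matching the "3 pieces" described above.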
Example Use
No response
Additional context
What do you think?