Realtime API types + example #276

Merged
merged 7 commits into from
Oct 7, 2024
6 changes: 6 additions & 0 deletions async-openai/Cargo.toml
@@ -22,6 +22,7 @@ rustls-webpki-roots = ["reqwest/rustls-tls-webpki-roots"]
native-tls = ["reqwest/native-tls"]
# Remove dependency on OpenSSL
native-tls-vendored = ["reqwest/native-tls-vendored"]
realtime = ["dep:tokio-tungstenite"]

[dependencies]
backoff = { version = "0.4.0", features = ["tokio"] }
@@ -46,6 +47,11 @@ async-convert = "1.0.0"
secrecy = { version = "0.8.0", features = ["serde"] }
bytes = "1.6.0"
eventsource-stream = "0.2.3"
tokio-tungstenite = { version = "0.24.0", optional = true, default-features = false }

[dev-dependencies]
tokio-test = "0.4.4"

[package.metadata.docs.rs]
all-features = true
rustdoc-args = ["--cfg", "docsrs"]
6 changes: 6 additions & 0 deletions async-openai/README.md
@@ -35,6 +35,7 @@
- [x] Models
- [x] Moderations
- [ ] Organizations | Administration
- [x] Realtime API types (Beta)
- [ ] Uploads
- SSE streaming on available APIs
- Requests (except SSE streaming) including form submissions are retried with exponential backoff when [rate limited](https://platform.openai.com/docs/guides/rate-limits).
@@ -58,6 +59,11 @@ $Env:OPENAI_API_KEY='sk-...'
- Visit [examples](https://github.com/64bit/async-openai/tree/main/examples) directory on how to use `async-openai`.
- Visit [docs.rs/async-openai](https://docs.rs/async-openai) for docs.

## Realtime API

Only types for the Realtime API are implemented; they can be enabled with the `realtime` feature flag.
These types may change when OpenAI releases official specs for them.

## Image Generation Example

1 change: 1 addition & 0 deletions async-openai/src/lib.rs
@@ -76,6 +76,7 @@
//! ## Examples
//! For full working examples for all supported features see [examples](https://github.com/64bit/async-openai/tree/main/examples) directory in the repository.
//!
#![cfg_attr(docsrs, feature(doc_cfg))]
mod assistant_files;
mod assistants;
mod audio;
3 changes: 3 additions & 0 deletions async-openai/src/types/mod.rs
@@ -17,6 +17,9 @@ mod message;
mod message_file;
mod model;
mod moderation;
#[cfg_attr(docsrs, doc(cfg(feature = "realtime")))]
#[cfg(feature = "realtime")]
pub mod realtime;
mod run;
mod step;
mod thread;
220 changes: 220 additions & 0 deletions async-openai/src/types/realtime/client_event.rs
@@ -0,0 +1,220 @@
use serde::{Deserialize, Serialize};
use tokio_tungstenite::tungstenite::Message;

use super::{item::Item, session_resource::SessionResource};

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct SessionUpdateEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,
    /// Session configuration to update.
    pub session: SessionResource,
}

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct InputAudioBufferAppendEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,
    /// Base64-encoded audio bytes.
    pub audio: String,
}
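
Reviewer note: `InputAudioBufferAppendEvent::audio` expects Base64 text, so callers have to encode their raw audio bytes first. The sketch below is a std-only illustration of that step, assuming the standard RFC 4648 alphabet with `=` padding and, for `encode_pcm16`, 16-bit little-endian PCM samples (neither detail is stated in this PR). `encode_base64` and `encode_pcm16` are hypothetical helper names; a real client would more likely use a maintained Base64 crate.

```rust
/// Standard Base64 alphabet (RFC 4648, section 4).
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical helper: encode raw bytes as padded Base64 text.
pub fn encode_base64(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into the low 24 bits of a u32.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        out.push(ALPHABET[((n >> 18) & 0x3f) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 0x3f) as usize] as char);
        // Missing input bytes are represented by '=' padding.
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 0x3f) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 0x3f) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Hypothetical helper: turn 16-bit PCM samples into little-endian bytes,
/// then Base64 (the sample format is an assumption, not taken from the PR).
pub fn encode_pcm16(samples: &[i16]) -> String {
    let bytes: Vec<u8> = samples.iter().flat_map(|s| s.to_le_bytes()).collect();
    encode_base64(&bytes)
}
```

The resulting `String` is what would go into the `audio` field before the event is converted into a WebSocket message.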

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct InputAudioBufferCommitEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,
}

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct InputAudioBufferClearEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,
}

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct ConversationItemCreateEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,

    /// The ID of the preceding item after which the new item will be inserted.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub previous_item_id: Option<String>,

    /// The item to add to the conversation.
    pub item: Item,
}

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct ConversationItemTruncateEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,

    /// The ID of the assistant message item to truncate.
    pub item_id: String,

    /// The index of the content part to truncate.
    pub content_index: u32,

    /// Inclusive duration up to which audio is truncated, in milliseconds.
    pub audio_end_ms: u32,
}
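
Reviewer note: `audio_end_ms` is expressed in milliseconds, while a playback loop usually counts samples. The sketch below shows one way to bridge the two. It is std-only and uses a local mirror of the struct without the serde derives; `samples_to_ms` and `truncate_at` are hypothetical helpers, and the sample rate is left to the caller because this PR does not state one.

```rust
/// Local mirror of the PR's struct, without serde derives, for illustration.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ConversationItemTruncateEvent {
    pub event_id: Option<String>,
    pub item_id: String,
    pub content_index: u32,
    pub audio_end_ms: u32,
}

/// Convert a count of played samples into whole milliseconds, rounding down.
pub fn samples_to_ms(samples_played: u64, sample_rate_hz: u32) -> u32 {
    assert!(sample_rate_hz > 0, "sample rate must be non-zero");
    let ms = samples_played.saturating_mul(1000) / u64::from(sample_rate_hz);
    // The event field is a u32, so saturate instead of wrapping.
    ms.min(u64::from(u32::MAX)) as u32
}

/// Hypothetical helper: build a truncate event from the playback position.
pub fn truncate_at(
    item_id: &str,
    content_index: u32,
    samples_played: u64,
    sample_rate_hz: u32,
) -> ConversationItemTruncateEvent {
    ConversationItemTruncateEvent {
        event_id: None,
        item_id: item_id.to_string(),
        content_index,
        audio_end_ms: samples_to_ms(samples_played, sample_rate_hz),
    }
}
```

Rounding down is a deliberate choice here: the event then never claims more audio than was actually played.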

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct ConversationItemDeleteEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,

    /// The ID of the item to delete.
    pub item_id: String,
}

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct ResponseCreateEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,

    /// Configuration for the response.
    pub response: Option<SessionResource>,
}

#[derive(Debug, Serialize, Deserialize, Clone, Default)]
pub struct ResponseCancelEvent {
    /// Optional client-generated ID used to identify this event.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub event_id: Option<String>,
}

/// These are events that the OpenAI Realtime WebSocket server will accept from the client.
#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
pub enum ClientEvent {
    /// Send this event to update the session’s default configuration.
    #[serde(rename = "session.update")]
    SessionUpdate(SessionUpdateEvent),

    /// Send this event to append audio bytes to the input audio buffer.
    #[serde(rename = "input_audio_buffer.append")]
    InputAudioBufferAppend(InputAudioBufferAppendEvent),

    /// Send this event to commit audio bytes to a user message.
    #[serde(rename = "input_audio_buffer.commit")]
    InputAudioBufferCommit(InputAudioBufferCommitEvent),

    /// Send this event to clear the audio bytes in the buffer.
    #[serde(rename = "input_audio_buffer.clear")]
    InputAudioBufferClear(InputAudioBufferClearEvent),

    /// Send this event when adding an item to the conversation.
    #[serde(rename = "conversation.item.create")]
    ConversationItemCreate(ConversationItemCreateEvent),

    /// Send this event when you want to truncate a previous assistant message’s audio.
    #[serde(rename = "conversation.item.truncate")]
    ConversationItemTruncate(ConversationItemTruncateEvent),

    /// Send this event when you want to remove any item from the conversation history.
    #[serde(rename = "conversation.item.delete")]
    ConversationItemDelete(ConversationItemDeleteEvent),

    /// Send this event to trigger a response generation.
    #[serde(rename = "response.create")]
    ResponseCreate(ResponseCreateEvent),

    /// Send this event to cancel an in-progress response.
    #[serde(rename = "response.cancel")]
    ResponseCancel(ResponseCancelEvent),
}
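
Reviewer note: with `#[serde(tag = "type")]`, each variant is serialized as a JSON object whose `type` field holds the variant's `rename` value. The sketch below writes that mapping out by hand so the wire names are easy to see. It is std-only and does not use serde; `ClientEventKind`, `wire_type`, and `bare_event_json` are hypothetical names, and payloads are omitted.

```rust
/// Payload-free stand-in for the PR's `ClientEvent` (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ClientEventKind {
    SessionUpdate,
    InputAudioBufferAppend,
    InputAudioBufferCommit,
    InputAudioBufferClear,
    ConversationItemCreate,
    ConversationItemTruncate,
    ConversationItemDelete,
    ResponseCreate,
    ResponseCancel,
}

impl ClientEventKind {
    /// The `type` value declared by the matching `#[serde(rename = ...)]`.
    pub fn wire_type(self) -> &'static str {
        match self {
            ClientEventKind::SessionUpdate => "session.update",
            ClientEventKind::InputAudioBufferAppend => "input_audio_buffer.append",
            ClientEventKind::InputAudioBufferCommit => "input_audio_buffer.commit",
            ClientEventKind::InputAudioBufferClear => "input_audio_buffer.clear",
            ClientEventKind::ConversationItemCreate => "conversation.item.create",
            ClientEventKind::ConversationItemTruncate => "conversation.item.truncate",
            ClientEventKind::ConversationItemDelete => "conversation.item.delete",
            ClientEventKind::ResponseCreate => "response.create",
            ClientEventKind::ResponseCancel => "response.cancel",
        }
    }
}

/// Hand-written JSON for an event that carries nothing but its tag, which is
/// the shape of commit, clear, and cancel events when `event_id` is `None`.
pub fn bare_event_json(kind: ClientEventKind) -> String {
    format!("{{\"type\":\"{}\"}}", kind.wire_type())
}
```

Events with payload fields add those fields alongside `type` in the same object; this sketch does not attempt to reproduce them.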

impl From<&ClientEvent> for String {
    fn from(value: &ClientEvent) -> Self {
        serde_json::to_string(value).unwrap()
    }
}

impl From<ClientEvent> for Message {
    fn from(value: ClientEvent) -> Self {
        Message::Text(String::from(&value))
    }
}

macro_rules! message_from_event {
    ($from_typ:ty, $evt_typ:ty) => {
        impl From<$from_typ> for Message {
            fn from(value: $from_typ) -> Self {
                Self::from(<$evt_typ>::from(value))
            }
        }
    };
}

macro_rules! event_from {
    ($from_typ:ty, $evt_typ:ty, $variant:ident) => {
        impl From<$from_typ> for $evt_typ {
            fn from(value: $from_typ) -> Self {
                <$evt_typ>::$variant(value)
            }
        }
    };
}

event_from!(SessionUpdateEvent, ClientEvent, SessionUpdate);
event_from!(
    InputAudioBufferAppendEvent,
    ClientEvent,
    InputAudioBufferAppend
);
event_from!(
    InputAudioBufferCommitEvent,
    ClientEvent,
    InputAudioBufferCommit
);
event_from!(
    InputAudioBufferClearEvent,
    ClientEvent,
    InputAudioBufferClear
);
event_from!(
    ConversationItemCreateEvent,
    ClientEvent,
    ConversationItemCreate
);
event_from!(
    ConversationItemTruncateEvent,
    ClientEvent,
    ConversationItemTruncate
);
event_from!(
    ConversationItemDeleteEvent,
    ClientEvent,
    ConversationItemDelete
);
event_from!(ResponseCreateEvent, ClientEvent, ResponseCreate);
event_from!(ResponseCancelEvent, ClientEvent, ResponseCancel);

message_from_event!(SessionUpdateEvent, ClientEvent);
message_from_event!(InputAudioBufferAppendEvent, ClientEvent);
message_from_event!(InputAudioBufferCommitEvent, ClientEvent);
message_from_event!(InputAudioBufferClearEvent, ClientEvent);
message_from_event!(ConversationItemCreateEvent, ClientEvent);
message_from_event!(ConversationItemTruncateEvent, ClientEvent);
message_from_event!(ConversationItemDeleteEvent, ClientEvent);
message_from_event!(ResponseCreateEvent, ClientEvent);
message_from_event!(ResponseCancelEvent, ClientEvent);

impl From<Item> for ConversationItemCreateEvent {
    fn from(value: Item) -> Self {
        Self {
            event_id: None,
            previous_item_id: None,
            item: value,
        }
    }
}
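
Reviewer note: the two macros in this file generate a conversion chain, so any event struct can be turned into a `ClientEvent` and from there into a WebSocket `Message` with a single `.into()`. The sketch below reproduces that pattern in a std-only form. `Message` here is a local stand-in for the tungstenite type, only two events are modelled, and `to_json` is a placeholder for the `serde_json::to_string` call, so this is an illustration of the pattern rather than the crate's code.

```rust
/// Stand-in for the WebSocket message type; only text frames are modelled.
#[derive(Debug, PartialEq)]
pub enum Message {
    Text(String),
}

#[derive(Debug, Default)]
pub struct ResponseCancelEvent {
    pub event_id: Option<String>,
}

#[derive(Debug, Default)]
pub struct InputAudioBufferCommitEvent {
    pub event_id: Option<String>,
}

#[derive(Debug)]
pub enum ClientEvent {
    ResponseCancel(ResponseCancelEvent),
    InputAudioBufferCommit(InputAudioBufferCommitEvent),
}

impl ClientEvent {
    /// Placeholder for real serialization: emits only the `type` tag.
    fn to_json(&self) -> String {
        let ty = match self {
            ClientEvent::ResponseCancel(_) => "response.cancel",
            ClientEvent::InputAudioBufferCommit(_) => "input_audio_buffer.commit",
        };
        format!("{{\"type\":\"{}\"}}", ty)
    }
}

impl From<ClientEvent> for Message {
    fn from(value: ClientEvent) -> Self {
        Message::Text(value.to_json())
    }
}

// Wraps an event struct in its enum variant.
macro_rules! event_from {
    ($from_typ:ty, $evt_typ:ty, $variant:ident) => {
        impl From<$from_typ> for $evt_typ {
            fn from(value: $from_typ) -> Self {
                <$evt_typ>::$variant(value)
            }
        }
    };
}

// Chains struct -> enum -> Message through the impls above.
macro_rules! message_from_event {
    ($from_typ:ty, $evt_typ:ty) => {
        impl From<$from_typ> for Message {
            fn from(value: $from_typ) -> Self {
                Self::from(<$evt_typ>::from(value))
            }
        }
    };
}

event_from!(ResponseCancelEvent, ClientEvent, ResponseCancel);
event_from!(InputAudioBufferCommitEvent, ClientEvent, InputAudioBufferCommit);
message_from_event!(ResponseCancelEvent, ClientEvent);
message_from_event!(InputAudioBufferCommitEvent, ClientEvent);
```

With these impls in place, `let msg: Message = ResponseCancelEvent::default().into();` compiles without naming `ClientEvent` at the call site, which appears to be the ergonomic goal of the macros.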
18 changes: 18 additions & 0 deletions async-openai/src/types/realtime/content_part.rs
@@ -0,0 +1,18 @@
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, Clone)]
#[serde(tag = "type")]
pub enum ContentPart {
    #[serde(rename = "text")]
    Text {
        /// The text content
        text: String,
    },
    #[serde(rename = "audio")]
    Audio {
        /// Base64-encoded audio data
        audio: Option<String>,
        /// The transcript of the audio
        transcript: String,
    },
}
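
Reviewer note: both `ContentPart` variants carry something readable, `text` for text parts and `transcript` for audio parts. The sketch below shows how a consumer might collect that into one string. It uses a local mirror of the enum without serde derives; `display_text` and `join_text` are hypothetical helpers, not part of this PR.

```rust
/// Local mirror of the PR's enum, without serde derives, for illustration.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentPart {
    Text {
        text: String,
    },
    Audio {
        audio: Option<String>,
        transcript: String,
    },
}

/// Hypothetical helper: the human-readable text of a single part.
pub fn display_text(part: &ContentPart) -> &str {
    match part {
        ContentPart::Text { text } => text.as_str(),
        // The Base64 audio payload is ignored; only the transcript is used.
        ContentPart::Audio { transcript, .. } => transcript.as_str(),
    }
}

/// Hypothetical helper: join the readable text of all parts with spaces.
pub fn join_text(parts: &[ContentPart]) -> String {
    parts
        .iter()
        .map(display_text)
        .collect::<Vec<_>>()
        .join(" ")
}
```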
10 changes: 10 additions & 0 deletions async-openai/src/types/realtime/conversation.rs
@@ -0,0 +1,10 @@
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct Conversation {
    /// The unique ID of the conversation.
    pub id: String,

    /// The object type, must be "realtime.conversation".
    pub object: String,
}
19 changes: 19 additions & 0 deletions async-openai/src/types/realtime/error.rs
@@ -0,0 +1,19 @@
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, Clone)]
pub struct RealtimeAPIError {
    /// The type of error (e.g., "invalid_request_error", "server_error").
    pub r#type: String,

    /// Error code, if any.
    pub code: Option<String>,

    /// A human-readable error message.
    pub message: String,

    /// Parameter related to the error, if any.
    pub param: Option<String>,

    /// The event_id of the client event that caused the error, if applicable.
    pub event_id: Option<String>,
}
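
Reviewer note: `RealtimeAPIError` derives `Debug` but, as far as this diff shows, has no `Display` or `std::error::Error` impl, so callers who want to log it or return it through `?` would need their own formatting. The sketch below shows one possible shape, using a local mirror of the struct without serde derives. The output format is an assumption chosen for illustration, not something the PR defines.

```rust
use std::fmt;

/// Local mirror of the PR's struct, without serde derives, for illustration.
#[derive(Debug, Clone)]
pub struct RealtimeAPIError {
    pub r#type: String,
    pub code: Option<String>,
    pub message: String,
    pub param: Option<String>,
    pub event_id: Option<String>,
}

impl fmt::Display for RealtimeAPIError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Always present: the error type and the human-readable message.
        write!(f, "{}: {}", self.r#type, self.message)?;
        // Optional fields are appended only when the server sent them.
        if let Some(code) = &self.code {
            write!(f, " (code: {})", code)?;
        }
        if let Some(param) = &self.param {
            write!(f, " (param: {})", param)?;
        }
        if let Some(event_id) = &self.event_id {
            write!(f, " (event_id: {})", event_id)?;
        }
        Ok(())
    }
}

impl std::error::Error for RealtimeAPIError {}
```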