
feat: added conservative tokenizer for Claude 3 #146

Open · wants to merge 20 commits into base: development from feat/support-prompt-truncation-for-claude3

Conversation

@adubovik (Collaborator) commented Sep 6, 2024:

Resolves #145

A byte-counting approximation is used, since the Claude 3 tokenizer hasn't been released yet.
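
A minimal sketch of the approximation (illustrative, not the exact adapter code): assuming every token covers at least one byte of text, the UTF-8 byte length of a string is an upper bound on its token count, which is what makes the estimate conservative.

def approximate_token_count(text: str) -> int:
    # Each token covers at least one byte, so the UTF-8 byte length
    # can never undercount the true number of tokens. Overestimating
    # keeps truncation conservative: the prompt never exceeds the limit.
    return len(text.encode("utf-8"))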

@adubovik adubovik self-assigned this Sep 6, 2024
@adubovik adubovik marked this pull request as ready for review September 10, 2024 10:48
@adubovik (Collaborator, Author) commented Sep 10, 2024:

/deploy-review

Environment URL: https://chat-ai-dial-adapter-bedrock-pr-146.nightly-test.deltixhub.io
E2E tests status: success


@adubovik force-pushed the feat/support-prompt-truncation-for-claude3 branch from 7b6944d to ca99288 on September 10, 2024 16:36
@adubovik (Collaborator, Author) commented:

See anthropics/anthropic-sdk-python#656 for a discussion of why Anthropic datatypes contain Iterable fields, which don't play well with Pydantic.
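
A toy illustration of the pitfall (hypothetical code, not from the SDK): a generator satisfies Iterable but can be consumed only once, so any layer that iterates the field more than once, as validation and serialization may, silently sees nothing the second time.

from typing import Iterable

def render(blocks: Iterable[str]) -> str:
    return " ".join(blocks)

blocks = (b for b in ["Hello", "world"])
print(render(blocks))  # "Hello world"
print(render(blocks))  # "" - the generator is already exhausted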

@adubovik (Collaborator, Author) commented:

See also anthropics/anthropic-sdk-python#375, which raises the issue of the missing tokenizer for Claude 3.

) -> List[int]:
    prompt = self.get_text_completion_prompt(params, messages)
    return prompt.discarded_messages or []
) -> List[int] | None:
@roman-romanov-o (Contributor) commented:

I guess that's the same discussion as we have here:

epam/ai-dial-adapter-openai#150 (comment)

My main argument for avoiding such -> List[int] | None in favour of a "squeezed" -> SqueezedResult is:

If you look at the result type of a function and see just primitive types, that gives you no information about what meaning those types hold.

For example, here I need to go and double-check that List[int] means we have some discarded-message indexes
(and this also assumes that I know we pass discarded messages as indexes of messages; if I didn't know that, this List[int] would make no sense to me before I actually go and read the code).

And None means that we discarded nothing.

Which is also kind of counterintuitive: why isn't it just an empty list then?


If this were some "squeezed" type, like

class PromptTruncationResult(BaseModel):
    """Indexes of discarded messages"""
    discarded_messages: List[int] | None = None

I would get the sense of this function immediately, just from the hints of my IDE.

@adubovik (Collaborator, Author) commented Sep 16, 2024:

And None means that we discarded nothing
Which is also kind of counterintuitive: why isn't it just an empty list then?

None and empty list have different meanings:

  • None - the algorithm wasn't applied at all,
  • Empty list - the algorithm was applied and the messages turned out to fit into the limit.

Here None encodes the optionality of applying the truncate-prompt algorithm in the first place.
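
A sketch of the three cases a caller has to distinguish (illustrative only; the call shape is assumed from the signatures above):

result: List[int] | None = await self.truncate_prompt(params, messages)

if result is None:
    ...  # truncation wasn't requested, so nothing was computed
elif not result:
    ...  # truncation ran and every message already fits the limit
else:
    ...  # indexes of the messages that must be discarded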

So, strictly speaking, PromptTruncationResult isn't a prompt-truncation result; it's an OptionalPromptTruncationResult, and PromptTruncationResult should be a wrapper for List[int], following the logic of your proposal.

I really don't like the proliferation of trivial types like this. It's Python, not Java, after all.

If you look at the result type of a function and see just primitive types, that gives you no information about what meaning those types hold.

Typically the name of the function provides a hint about the meaning of the result. Like:

def concat_lists(a: List[T], b: List[T]) -> List[T]:
    pass

It's clear what List[T] means. You don't have to wrap it into a ConcatenationResult[T] class.

truncate_prompt is literally part of the DIAL API; it's not some obscure method - it's implemented in 3 DIAL packages at this point.

@roman-romanov-o (Contributor) commented:

You provided a great example for discussion.

Because, for instance, if this function were something like this:

async def truncate_prompt(
    self, params: ModelParameters, messages: List[Message]
) -> List[Message]:
    ...

I would have no objections: here you can tell at first glance that truncation takes a list of input messages and returns a modified list of messages.

From the naming of the function itself I clearly get that truncate means shortening/shrinking something,
and the typing helps me with understanding.


Now let's look at what we have:

async def truncate_prompt(
    self, params: ModelParameters, messages: List[Message]
) -> List[int] | None:
    ...

Ok, so I'm truncating a prompt, that is, a list of messages, and I get... a list of integers?? What does this list of integers mean? What does None mean?

@roman-romanov-o (Contributor) commented:

I really don't like the proliferation of trivial types like this. It's Python, not Java, after all.

If this function were super easy to understand, then I would have no objections.

We're basically writing in a language that has

Explicit is better than implicit.
...
Readability counts.

in its motto.

@roman-romanov-o (Contributor) commented:

This function does not truncate anything.

It returns a list of indexes that we should remove from messages in order for the prompt to fit into the max_prompt_tokens limit.

Maybe we should rename this function then, to something like:

async def get_indexes_to_remove(
    self, params: ModelParameters, remove_from: List[Message]
) -> List[int] | None:
    ...

@adubovik (Collaborator, Author) commented:

  1. Renamed to compute_discarded_messages
  2. Introduced DiscardedMessages = List[int] to improve readability
  3. Added comments to explain what None means
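
A sketch of the resulting shape (the names follow the list above; the signature is assumed from the earlier snippets):

DiscardedMessages = List[int]

async def compute_discarded_messages(
    self, params: ModelParameters, messages: List[Message]
) -> DiscardedMessages | None:
    # None: the truncation algorithm wasn't applied at all.
    # Otherwise: indexes of the messages to discard (possibly empty).
    ...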

tokenize_messages: Callable[[Messages], int],
keep_message: Callable[[Messages, int], bool],
partition_messages: Callable[[Messages], List[int]],
async def truncate_prompt(
@roman-romanov-o (Contributor) commented:

This truncate_prompt and truncate_prompt_ setup looks twisted.

Why can't we just do these operations inside of chat_model and the adapter?

That's only 6 lines of code (3 of them are duplicated), but it's more explicit:

    @override
    async def truncate_and_linearize_messages(
        self, messages: List[BaseMessage], max_prompt_tokens: Optional[int]
    ) -> TextCompletionPrompt:
        truncate_result = await truncate_prompt(
            messages=messages,
            tokenize_messages=self.tokenize_messages,
            keep_message=keep_last_and_system_messages,
            partition_messages=self.partitioner,
            model_limit=None,
            user_limit=max_prompt_tokens,
        )
        if isinstance(truncate_result, TruncatePromptError):
            raise truncate_result.to_dial_exception()
        messages = omit_by_indices(messages, truncate_result)

@adubovik (Collaborator, Author) commented:

What is now duplicated will become triplicated and many-more-times-licated in the future, when we add new models with truncate-prompt support. And we will.

@roman-romanov-o (Contributor) commented Sep 18, 2024:

Ok, can we at least get rid of this truncate_prompt and truncate_prompt_?

Imagine that you're new to this repo, just scrolling the codebase, and you see two functions with basically the same naming and different output types.

You have two functions:

async def truncate_prompt(
...
) -> Tuple[List[int], List[_T]]:
    ...


async def truncate_prompt_(
...
) -> Set[int] | TruncatePromptError:
    ...

I can't imagine any situation where you wouldn't be confused by this without actually digging into each of these functions.

@adubovik (Collaborator, Author) commented:

Renamed truncate_prompt_ => compute_discarded_messages since it only returns a list of indices of discarded messages, whereas truncate_prompt does actually return a truncated list of messages.
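
A sketch of the resulting pair (signatures copied from the snippets above, with the rename applied; parameters elided as in the original snippets):

async def compute_discarded_messages(
...
) -> Set[int] | TruncatePromptError:
    # Only computes which message indexes should be discarded.
    ...

async def truncate_prompt(
...
) -> Tuple[List[int], List[_T]]:
    # Returns both the discarded indexes and the truncated messages.
    ...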

messages: List[_T],
tokenize_messages: Callable[[List[_T]], Awaitable[int]],
keep_message: Callable[[List[_T], int], bool],
partition_messages: Callable[[List[_T]], List[int]],
@roman-romanov-o (Contributor) commented:

It's partitioner, if we're consistent with the chain of calls.

@adubovik (Collaborator, Author) commented Sep 16, 2024:

I don't follow the logic. Why should it be partitioner?

Here I named all the callbacks as verbs, after the action they perform:

  • tokenize_messages - it tokenizes a list of messages,
  • keep_message - it marks a message as required (maybe I should rename it to is_message_required?),
  • partition_messages - it creates a partition of the messages.

@roman-romanov-o (Contributor) commented:

Because the chain of calls here is this:

class PseudoChatModel(...):
    partitioner: Callable[[List[BaseMessage]], List[int]]

    @override
    async def truncate_and_linearize_messages(
        self, messages: List[BaseMessage], max_prompt_tokens: Optional[int]
    ) -> TextCompletionPrompt:
        discarded_messages, messages = await truncate_prompt(
            ...
            partition_messages=self.partitioner,  # <-------
        )

The meaning of this function has not changed: it's a function that takes a list of elements and returns a list of partition sizes.

Why do we need to change the naming then?

Let's choose either partition_messages or partitioner, and use it consistently.
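
A hypothetical illustration of "a list of partition sizes" (the message names are made up): for a system message followed by two user/assistant turns, a turn-based partitioner would return the group sizes, not the groups themselves.

messages = [system_msg, user_1, assistant_1, user_2, assistant_2]
partitioner(messages)  # -> [1, 2, 2]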

@adubovik (Collaborator, Author) commented:

Fixed, renamed to partitioner and tokenizer.

messages: List[_T],
tokenize_messages: Callable[[List[_T]], Awaitable[int]],
keep_message: Callable[[List[_T], int], bool],
partition_messages: Callable[[List[_T]], List[int]],
@roman-romanov-o (Contributor) commented:

partitioner


@adubovik (Collaborator, Author) commented:

Fixed

Resolved review threads:

  • aidial_adapter_bedrock/llm/truncate_prompt.py (outdated)
  • aidial_adapter_bedrock/llm/chat_model.py
@adubovik force-pushed the feat/support-prompt-truncation-for-claude3 branch from fea64c5 to da2fcda on September 20, 2024 11:24
Development

Successfully merging this pull request may close these issues:

  • Support prompt truncation for Claude 3 models

2 participants