docs: Upgrade upgrading guide #278
base: master
@@ -28,6 +31,7 @@ Attributes suffixed with `_millis` were renamed to remove said suffix and have t
- `Actor.start`, `Actor.call`, `Actor.start_task`, `Actor.set_status_message` and `Actor.abort` return instances of the `ActorRun` model instead of an untyped `dict`.
- Upon entering the context manager (`async with Actor`), the `Actor` puts the default logging configuration in place. This can be disabled using the `configure_logging` parameter.
- The `config` parameter of `Actor` has been renamed to `configuration`.
- Event handlers registered via `Actor.on` will now receive Pydantic objects instead of untyped dicts. For example, where you would do `event['isMigrating']`, you should now use `event.is_migrating`
This is intentionally vague - maybe we should expose the event models somehow so that we can link them from here. Currently, they are internal members of Crawlee.
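For illustration, the access-pattern change described in that bullet could be sketched as below. This is only a rough sketch: `MigratingEventSketch` is a hypothetical stdlib stand-in, not the actual event model (those are Pydantic classes internal to Crawlee, and their names and fields may differ). It only shows the move from camelCase dict keys to snake_case attributes.

```python
from dataclasses import dataclass


@dataclass
class MigratingEventSketch:
    """Hypothetical stand-in for an event model; not the real Crawlee class."""

    is_migrating: bool = False


def old_handler(event: dict) -> bool:
    # Before the upgrade: camelCase key lookup on an untyped dict.
    return event['isMigrating']


def new_handler(event: MigratingEventSketch) -> bool:
    # After the upgrade: snake_case attribute access on a typed object.
    return event.is_migrating


# Both handlers read the same flag; only the access style differs.
old_result = old_handler({'isMigrating': True})
new_result = new_handler(MigratingEventSketch(is_migrating=True))
```

In an Actor, the handler would be registered via `Actor.on` as the bullet describes; only the body of the handler needs to change.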
just 2 comments
- The SDK now uses [crawlee](https://github.com/apify/crawlee-python) for local storage emulation. This change should not affect intended usage (working with `Dataset`, `KeyValueStore` and `RequestQueue` classes from the `apify.storages` module or using the shortcuts exposed by the `Actor` class) in any way.
- There is a difference in the `RequestQueue.add_request` method: it accepts an `apify.Request` object instead of a free-form dictionary.
- A quick way to migrate from dict-based arguments is to wrap it with a `Request.model_validate()` call.
- The preferred way is to instantiate it directly, e.g., `Request(url='https://example.tld', ...)`, or using the `Request.from_url` helper which prefills the `unique_key` and `id` attributes.
I think we should mention `from_url` as the preferred way of creating new `Request` objects. The need to instantiate one directly should be quite rare.
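To make the two migration paths concrete, here is a minimal sketch using a stand-in class. `RequestSketch`, its `from_legacy_dict` helper, and the way `unique_key` and `id` are derived are all assumptions made for illustration; the real `apify.Request` is a Pydantic model, `Request.model_validate()` plays the role of `from_legacy_dict` here, and the real `Request.from_url` may compute its fields differently.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class RequestSketch:
    """Hypothetical stand-in for `apify.Request`; not the SDK class."""

    url: str
    unique_key: str
    id: str

    @classmethod
    def from_url(cls, url: str) -> 'RequestSketch':
        # Assumption for illustration only: derive both fields from the URL.
        unique_key = url
        request_id = hashlib.sha256(unique_key.encode('utf-8')).hexdigest()[:15]
        return cls(url=url, unique_key=unique_key, id=request_id)

    @classmethod
    def from_legacy_dict(cls, data: dict) -> 'RequestSketch':
        # Quick migration path: wrap an existing camelCase dict in a typed object.
        return cls(url=data['url'], unique_key=data['uniqueKey'], id=data['id'])


# Preferred path: build the request from a URL and let the helper prefill the rest.
preferred = RequestSketch.from_url('https://example.tld')

# Quick path: convert a dict that older code already constructs.
legacy = {'url': 'https://example.tld', 'uniqueKey': 'https://example.tld', 'id': 'abc123'}
migrated = RequestSketch.from_legacy_dict(legacy)
```

The helper path keeps call sites short and avoids hand-written `unique_key` and `id` values, which fits the suggestion that direct instantiation should be rare.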
Removing the `StorageClientManager` class is a significant change. If you need to change the storage client, use `crawlee.service_container` instead.
- The SDK now uses [crawlee](https://github.com/apify/crawlee-python) for local storage emulation. This change should not affect intended usage (working with `Dataset`, `KeyValueStore` and `RequestQueue` classes from the `apify.storages` module or using the shortcuts exposed by the `Actor` class) in any way.
- There is a difference in the `RequestQueue.add_request` method: it accepts an `apify.Request` object instead of a free-form dictionary.
Is it only `RQ.add_request`? Do no other methods work with requests? We should also mention that users can provide either just a URL as a string or an `apify.Request` object.
Do the bullets below this one not satisfy this need?
No description provided.