TIKA-4272 add pf4j to tika pipes, decouple Fetcher with Fetcher Config, stop using shade plugin for fetchers #1906

Draft
wants to merge 15 commits into base: main

@nddipiazza nddipiazza commented Aug 18, 2024

Tika Pipes has a problem in production: the extensible Fetcher objects we load into the Tika Server and Tika Grpc Server can run into classpath loading issues with other Fetchers. They need to be fully classpath-independent of each other.

In order to fix this, I am attempting to introduce pf4j in this pull.

In this pull, the shade plugin goes completely bye-bye in favor of Maven dependency plugin and assembly plugin.

All Fetchers are now loaded via the plugin manager, and each Fetcher's classpath is pulled in dynamically with a classloader separate from those of the other Fetchers.

Some changes come as a result:

So now, instead of holding Fetcher objects, the Tika configuration holds only their FetcherConfig, because we don't need a full copy of the Fetcher there anymore.

So now the fetcherConfig is the only thing stored in the Tika Config and the pf4j plugin manager handles loading the correct Fetcher, and then you send it the configuration that it requires.
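
To make the split concrete, here is a minimal sketch of the idea in plain Java. It does not use pf4j and it is not the code in this pull: `FetcherConfig`, `Fetcher`, `FetcherRegistry`, the `file-system-fetcher` id and the `basePath` parameter are all hypothetical names for illustration. The registry stands in for the plugin manager, which would normally resolve the Fetcher from a plugin with its own classloader.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class FetcherConfigSketch {

    /** Hypothetical: plain configuration data, the only thing kept in the Tika config. */
    static final class FetcherConfig {
        final String pluginId;
        final Map<String, String> params;

        FetcherConfig(String pluginId, Map<String, String> params) {
            this.pluginId = pluginId;
            this.params = Map.copyOf(params);
        }
    }

    /** Hypothetical, simplified fetcher contract; returns a resolved location instead of a stream. */
    interface Fetcher {
        String fetch(String fetchKey);
    }

    /** Stand-in for the plugin manager: maps a plugin id to a factory that builds a Fetcher from its config. */
    static final class FetcherRegistry {
        private final Map<String, Function<FetcherConfig, Fetcher>> factories = new HashMap<>();

        void register(String pluginId, Function<FetcherConfig, Fetcher> factory) {
            factories.put(pluginId, factory);
        }

        Fetcher create(FetcherConfig config) {
            Function<FetcherConfig, Fetcher> factory = factories.get(config.pluginId);
            if (factory == null) {
                throw new IllegalArgumentException("No fetcher plugin registered for id: " + config.pluginId);
            }
            return factory.apply(config);
        }
    }

    static String demo() {
        FetcherRegistry registry = new FetcherRegistry();
        // The "plugin" contributes a factory; the Fetcher is only built once a config is handed to it.
        registry.register("file-system-fetcher", cfg -> key -> cfg.params.get("basePath") + "/" + key);

        // Only this config object would need to be serialized in the Tika config.
        FetcherConfig config = new FetcherConfig("file-system-fetcher", Map.of("basePath", "/data"));
        return registry.create(config).fetch("report.pdf");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is that the config object carries no Fetcher code at all, so it can be stored and serialized without pulling any Fetcher's dependencies onto the server's classpath.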

So now I'm going into the Tika XML serialization code, where I need to put the FetcherConfig in place of the Fetcher objects previously stored there.

@nddipiazza nddipiazza marked this pull request as draft August 18, 2024 15:49
@nddipiazza nddipiazza changed the title Tika 4272 docker TIKA-4272 add pf4j to tika pipes, decouple Fetcher with Fetcher Config Aug 24, 2024
@nddipiazza nddipiazza changed the title TIKA-4272 add pf4j to tika pipes, decouple Fetcher with Fetcher Config TIKA-4272 add pf4j to tika pipes, decouple Fetcher with Fetcher Config, stop using shade plugin for fetchers Aug 24, 2024