Allow unordered indexes in staged area when sort_and_finalize_staged_data is used #1738

Open
Tracked by #1679
vasil-pashov opened this issue Aug 1, 2024 · 0 comments · May be fixed by #1799
Labels: bug
vasil-pashov commented Aug 1, 2024

Is your feature request related to a problem? Please describe.
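Currently sort_and_finalize_staged_data requires the indexes in all staged segments to be sorted; otherwise an exception is thrown. In the example below the exception is raised as soon as the unsorted data is staged, before sort_and_finalize_staged_data is reached.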

import pandas as pd
import numpy as np
import arcticdb as adb

ac = adb.Arctic("lmdb://test")
lib = ac.get_library("test", create_if_missing=True)
dates = [np.datetime64('2023-01-03'), np.datetime64('2023-01-01'), np.datetime64('2023-01-05')]  # deliberately unsorted
df = pd.DataFrame({"col": [2, 1, 3]}, index=dates)
lib.write("sym", df, staged=True)  # raises UnsortedDataException (see traceback below)
lib.sort_and_finalize_staged_data("sym")

Output:

Traceback (most recent call last):
  File "...\test.py", line 9, in <module>
    lib.write("sym", df, staged=True)
  File "...\arcticdb\version_store\library.py", line 461, in write
    return self._nvs.write(
  File "...\arcticdb\version_store\_store.py", line 583, in write
    self.version_store.write_parallel(symbol, item, norm_meta, udm)
arcticdb_ext.exceptions.UnsortedDataException: E_UNSORTED_DATA When writing/appending staged data in parallel, input data must be sorted.
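
Until unsorted staged data is accepted, one possible workaround is to sort each DataFrame by its index before staging it. A minimal sketch, assuming `lib` is a library object like the one in the snippet above; `stage_sorted` is a hypothetical helper name used only for illustration and is not part of arcticdb:

```python
import pandas as pd


def stage_sorted(lib, symbol, df):
    """Sort df by its index, then stage it.

    Hypothetical helper for illustration; not an arcticdb API.
    """
    # mergesort is stable, so rows sharing a timestamp keep their original order.
    sorted_df = df.sort_index(kind="mergesort")
    # Same staged write as in the reproduction above, but on sorted input.
    lib.write(symbol, sorted_df, staged=True)
    return sorted_df
```

With this, `stage_sorted(lib, "sym", df)` would replace the failing `lib.write` call, followed by `lib.sort_and_finalize_staged_data("sym")` as before. The cost is an extra sort on the client for every staged write, which is what this issue asks to avoid.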

Describe the solution you'd like
Allow unordered indexes in staged segments and sort them when sort_and_finalize_staged_data is called.
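
In pandas terms, the requested result would presumably match concatenating all staged frames and sorting the combined frame by index. The sketch below only illustrates that expected outcome; `expected_finalized` and the second segment `seg_b` are made up for the example, and nothing here reflects how arcticdb would implement it:

```python
import pandas as pd


def expected_finalized(staged_frames):
    """Illustrative only: the frame one would expect after finalizing."""
    # Combine every staged frame, then order all rows by timestamp.
    return pd.concat(staged_frames).sort_index(kind="mergesort")


# Two staged segments, each with an unsorted index.
seg_a = pd.DataFrame(
    {"col": [2, 1, 3]},
    index=pd.to_datetime(["2023-01-03", "2023-01-01", "2023-01-05"]),
)
seg_b = pd.DataFrame(
    {"col": [5, 4]},
    index=pd.to_datetime(["2023-01-04", "2023-01-02"]),
)

result = expected_finalized([seg_a, seg_b])
print(result["col"].tolist())  # [1, 4, 2, 5, 3]
```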

vasil-pashov changed the title from "sort_and_finalize does not work when the index is not sorted per segment. Should it?" to "Allow unordered indexes in staged area when sort_and_finalize_staged_data is used" on Aug 1, 2024
vasil-pashov added the enhancement label on Aug 1, 2024
alexowens90 added the bug label and removed the enhancement label on Aug 16, 2024
@vasil-pashov vasil-pashov linked a pull request Aug 30, 2024 that will close this issue
5 tasks
2 participants