Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] No subclasses refactor #435

Closed
wants to merge 6 commits into from
Closed

[WIP] No subclasses refactor #435

wants to merge 6 commits into from

Conversation

cgarciae
Copy link
Contributor

@cgarciae cgarciae commented Nov 12, 2021

The idea of this PR is to:

  1. Remove the classes LuxDataFrame, LuxSeries, etc
  2. Entirely avoid strong modifications to the pandas library such as overriding pandas.DataFrame class (and all other internal references to it) with their Lux counterpart e.g.:
 pandas.DataFrame = LuxDataFrame

Current behaviour can be surprising to the user and it makes supporting new versions harder as you might have to deal with more edge cases because the impact of the modifications is so large. Instead the idea is to some lightweight monkey patching of fiew core methods (which mostly do bookkeeping) and move the rest to out of the DataFrame class to avoid possible conflicts. A lot of discussion and possible iteration is needed. These are the current efforts:

Ongoing changes

  • LuxDataFrame as a class is removed, instead key methods of interest form pandas.DataFrame are patched.
  • The type LuxDataFrame is now a Protocol that is there only for possible validation and type inference purposes.
  • Core methods are patched using a new @patch(cls) decorator, this replaces the method on the class but the original method is still available as ._super_{method_name}, e.g:
@patch(DataFrame)
def _set_axis(self, axis, labels):
    self._super_set_axis(axis, labels)
    ...
  • To reduce the surface area in which Lux impacts Pandas, a single .lux property that includes all lux extension methods is created. Its very similar to the existing .str and .dt functionality already present in pandas, here is an example usage:
df.lux.save_as_html(...)

@cgarciae
Copy link
Contributor Author

cgarciae commented Nov 22, 2021

@dorisjlee Current status:

  • I experimented using groupby in "mini_lux" and it propagated the information out of the box 🎉. This means we can probably drop groupby.py.
  • I finished porting frame.py for now but had to remove many references to LuxDataFrame in many of the other modules to avoid circular dependencies. I think that we can solve this later by defining LuxDataFrame in a separate module.

@dorisjlee
Copy link
Member

Thanks for the update, that sounds great @cgarciae! Glad to hear that we don't need to patch up GroupbyObjects explicitly to get the propagation to work!

Let me know when you want me to take a look at this patch, happy to go through and play around with this if it would be helpful!

@cgarciae
Copy link
Contributor Author

Closed in favor of #437

@cgarciae cgarciae closed this Nov 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants