Enhance xGitGuard Scanner with BERT Model for Advanced Secret Detection #34

Open
radhi1991 opened this issue Jun 25, 2024 · 0 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Details:
Transformer-based models are a better fit for this problem because they capture the context around each line of code. Random forest models generally do not perform well on high-dimensional data, and the existing models are better suited to non-sequential data. Because source code is sequential, the proposed transformer models should work better.
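To make "context around lines of code" concrete, here is a minimal sketch of how a candidate line could be packaged with its neighbours before being handed to a sequence model. The helper name `context_window`, the `radius` parameter, and the sample source lines are all hypothetical and are not taken from the xGitGuard codebase.

```python
from typing import List


def context_window(lines: List[str], index: int, radius: int = 2) -> str:
    """Return the line at `index` plus up to `radius` lines on each side.

    Hypothetical helper: the window is clipped at the start and end of the file.
    """
    start = max(0, index - radius)
    end = min(len(lines), index + radius + 1)
    return "\n".join(lines[start:end])


# Made-up example file; the key below is not a real credential.
source = [
    "import requests",
    "",
    "API_URL = 'https://example.com/api'",
    "API_KEY = 'sk_test_1234567890abcdef'",
    "resp = requests.get(API_URL, headers={'X-Key': API_KEY})",
]

# Candidate on the fourth line (index 3), with one line of context on each side.
window = context_window(source, index=3, radius=1)
```

A per-line feature vector would only see the `API_KEY` assignment, while the window also shows how the value is used in the request on the following line.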

The solution:
We propose to enhance the xGitGuard scanner by integrating a BERT model specifically trained for secret detection.

The steps include:

  1. Training and building models using BERT:
    Develop machine learning models for secret detection based on the BERT architecture.

  2. Integrating BERT into scanners:
    Integrate the trained BERT model into the xGitGuard scanner so that it detects sensitive information with higher accuracy.
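The integration step above could be sketched roughly as follows, assuming the scanner first collects candidate lines and then asks a model to score each one. Everything here is an illustrative assumption: the `CANDIDATE` pattern, the `scan` function, the `threshold` value, and `dummy_score`, which merely stands in for a trained BERT classifier so that the example runs without any model.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical candidate pattern: a key-like name assigned a quoted value
# of at least 8 characters. Not the pattern xGitGuard actually uses.
CANDIDATE = re.compile(
    r"(?i)(key|token|secret|password)\w*\s*=\s*['\"]([^'\"]{8,})['\"]"
)


def scan(
    lines: List[str],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> List[Tuple[int, float]]:
    """Return (line_number, score) for candidates scored at or above threshold."""
    findings = []
    for number, line in enumerate(lines, start=1):
        if not CANDIDATE.search(line):
            continue  # cheap pre-filter before calling the model
        score = score_fn(line)
        if score >= threshold:
            findings.append((number, score))
    return findings


def dummy_score(text: str) -> float:
    """Stand-in for a trained model: obvious placeholders get a low score."""
    return 0.1 if "example" in text.lower() else 0.9


# Made-up input; none of these values are real credentials.
sample = [
    "password = 'example-password'",
    "api_token = 'a8f3k29dkq01lz7p'",
    "timeout = 30",
]
findings = scan(sample, dummy_score)
```

Keeping the model behind a plain `score_fn` callable means the scanner logic does not depend on any particular framework, so a fine-tuned BERT classifier (or one of the alternatives listed below) could be swapped in without changing `scan`.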

Alternatives:
Other pre-trained models, such as PaLM, Gemini, or any of the GPT models.

Additional context:
Training the model requires a considerable amount of training data.
