Problem: open-source maintainers spend a lot of time managing duplicate/related (doppelgänger) issues & pull requests
Solution: doppelgänger compares newly submitted issues/PRs against existing ones to automatically flag duplicate/related (doppelgänger) issues/PRs
Topics: vector db, github, open-source, embedding search, rag, similarity scores
This application is a GitHub App that automatically compares newly opened issues and pull requests with existing ones, then closes and comments on highly similar ones to reduce duplication.
- Python 3.7+
- A GitHub account
- A server or hosting platform to run the app (e.g., Heroku, DigitalOcean, AWS)
- Go to your GitHub account settings.
- Click on "Developer settings" in the left sidebar.
- Select "GitHub Apps" and click "New GitHub App".
- Fill in the required information:
  - GitHub App name: Choose a unique name (e.g., "Issue Similarity Checker")
  - Homepage URL: Your app's website or your GitHub profile
  - Webhook URL: The URL where your server will be running (e.g., https://your-server.com/webhook)
  - Webhook secret: Generate a secure secret and save it for later use
- Set permissions:
  - Repository permissions:
    - Issues: Read & write
    - Pull requests: Read & write
    - Webhooks: Read-only
- Subscribe to events:
  - Issues
  - Pull request
- Create the app and note down the App ID
- Generate a private key and download it (you'll need this later)
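The webhook secret from the steps above can be any high-entropy random string. A minimal sketch, assuming Python's standard `secrets` module is an acceptable way to generate it (the function name `generate_webhook_secret` is hypothetical and not part of doppelganger):

```python
import secrets


def generate_webhook_secret(num_bytes: int = 32) -> str:
    """Return a random hex string (two hex characters per byte)."""
    return secrets.token_hex(num_bytes)


# Print one secret; paste it into the GitHub App form and keep a copy for .env.
print(generate_webhook_secret())
```

The same value has to be entered in the GitHub App form and stored on the server, since both sides use it to sign and verify webhook deliveries.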
- Clone this repository:

      git clone https://github.com/dannyl1u/doppelganger.git
      cd doppelganger

- Install dependencies:

      pip install -r requirements.txt

- Create a `.env` file in the project root with the following content:

      APP_ID=your_app_id_here
      WEBHOOK_SECRET=your_webhook_secret_here

- Place the downloaded private key in the project root and name it `rsa.pem`.
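To illustrate how the two `.env` values might be read at startup, here is a standard-library sketch. It is an assumption about usage, not the project's actual loader (doppelganger may well use a package such as python-dotenv instead); the helper names `parse_env_text` and `load_settings` are hypothetical.

```python
import os
from pathlib import Path

REQUIRED_SETTINGS = ("APP_ID", "WEBHOOK_SECRET")


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(env_path: str = ".env") -> dict:
    """Return the required settings; real environment variables win over .env."""
    path = Path(env_path)
    file_values = parse_env_text(path.read_text()) if path.is_file() else {}
    settings = {}
    for name in REQUIRED_SETTINGS:
        value = os.environ.get(name) or file_values.get(name)
        if not value:
            raise RuntimeError(f"Missing required setting: {name}")
        settings[name] = value
    return settings
```

Letting real environment variables take precedence means the same code can run locally with a `.env` file and on a hosting platform where the values are set through its dashboard.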
- Choose a hosting platform (e.g., Heroku, DigitalOcean, AWS) and follow their deployment instructions.
- Make sure your server is accessible via HTTPS.
- Set the environment variables (`APP_ID`, `WEBHOOK_SECRET`) on your hosting platform.
- Upload the `rsa.pem` file to your server (ensure it's not publicly accessible).
- Go back to your GitHub App settings.
- Update the Webhook URL to point to your deployed application (e.g., https://your-server.com/webhook).
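GitHub signs each webhook delivery with the webhook secret and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`. The sketch below shows one way a `/webhook` endpoint could check that header before trusting a payload. Whether doppelganger performs the check exactly this way is not stated here, and the function name `verify_signature` is hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def verify_signature(payload: bytes, secret: str, signature_header: Optional[str]) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # Compare as bytes in constant time to avoid leaking timing information.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

The check must run on the raw request bytes, not on a re-serialized JSON object, because any change in whitespace or key order produces a different digest.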
- Go to your GitHub App's settings page.
- In the "Install App" section, click "Install App" or "Add Installation".
- Choose the account where you want to install the app.
- Select the repository (or repositories) where you want to use the app.
- Confirm the installation.
Once installed, the app will automatically:
- Monitor new issues and PRs in the selected repositories.
- Compare new issues and PRs with existing ones using semantic similarity.
- Close and comment on highly similar issues and PRs to reduce duplication.
You can adjust the similarity threshold by modifying the `SIMILARITY_THRESHOLD` variable in the script. The default is 0.5.
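To make the effect of the threshold concrete, here is a toy sketch of threshold-based matching. It assumes cosine similarity over embedding vectors and a "greater than or equal" comparison; neither detail is confirmed by this README. The vectors are made-up two-dimensional stand-ins for real embeddings, and `find_doppelgangers` is a hypothetical name.

```python
import math

SIMILARITY_THRESHOLD = 0.5  # default value mentioned in this README


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_doppelgangers(new_vec, existing, threshold=SIMILARITY_THRESHOLD):
    """Return (issue_number, score) pairs at or above the threshold, best match first."""
    scored = ((num, cosine_similarity(new_vec, vec)) for num, vec in existing.items())
    return sorted((m for m in scored if m[1] >= threshold), key=lambda m: m[1], reverse=True)


# Toy data: issue number -> embedding (hypothetical values).
existing_issues = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
matches = find_doppelgangers([1.0, 0.0], existing_issues)
```

With these toy vectors, issues 1 and 3 clear the default threshold and issue 2 does not. Raising the threshold makes the app more conservative about closing issues, while lowering it flags more loosely related ones.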
- Check the server logs for any error messages.
- Ensure all environment variables are correctly set.
- Verify that the `rsa.pem` file is present and correctly formatted.
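The environment and key checks above can be scripted. The following is a rough diagnostic sketch, not part of doppelganger: `check_setup` is a hypothetical helper, and the PEM test is only a heuristic that looks at the header line rather than validating the key itself.

```python
import os
from pathlib import Path


def check_setup(environ=None, key_path="rsa.pem"):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    for name in ("APP_ID", "WEBHOOK_SECRET"):
        if not environ.get(name):
            problems.append(f"{name} is not set")
    key = Path(key_path)
    if not key.is_file():
        problems.append(f"{key_path} not found")
    else:
        # Heuristic: a PEM private key starts with a "-----BEGIN ... PRIVATE KEY-----" line.
        lines = key.read_text(errors="replace").strip().splitlines()
        header = lines[0] if lines else ""
        if not (header.startswith("-----BEGIN") and header.endswith("PRIVATE KEY-----")):
            problems.append(f"{key_path} does not look like a PEM private key")
    return problems


for problem in check_setup():
    print("Problem:", problem)
```

Running this on the server before digging through logs can rule out the most common configuration mistakes.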
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License.