Service to delete duplicate items #15942
Replies: 6 comments 5 replies
✨ Thank you for your code contribution proposal! While the Bitwarden team reviews your submission, we encourage you to check out our contribution guidelines. Please ensure that your code contribution includes a detailed description of what you would like to contribute, along with any relevant screenshots and links to existing feature requests. This information helps us gather feedback from the community and Bitwarden team members before you start writing code. To keep discussions focused, posts that do not include a proposal for a code contribution will be removed.
Thank you for contributing to Bitwarden!
Please see PR 15967, opened for this feature integration.
[PM-24630] feature: de-duplication tool with test coverage PR #15967
Duplicate Detection Logic
Currently only login ciphers are fully supported. Two or more ciphers are considered duplicates if any of the following are true:
Important clarification and consideration
Multiple shared URI example
Consider the following login ciphers:
The example above shows how URI normalization is used in this feature: URIs are normalized so that only the subdomain, domain, and top-level domain are retained (see hostname). So the set of ciphers above (5 URIs in total, 3 of which are unique) yields only 2 normalized ciphers. The screenshot below demonstrates that the duplicate review dialog presents "sets" of duplicate logins, which correspond to the buckets described earlier. Since the username+URI combinations resulting from URI normalization yielded 2 buckets, two sets are presented to the user. Because 3 ciphers were present in these 2 sets, the user is informed that 3 duplicates were found.
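The bucketing described above can be sketched roughly as follows. This is a minimal illustration, not the code from PR #15967: the `LoginCipher` shape, the function names, and the sample data are all hypothetical, chosen only so that the numbers line up with the description (5 URIs, 3 unique, 2 buckets, 3 ciphers). It assumes normalization is equivalent to taking `URL.hostname`.

```typescript
// Hypothetical, simplified cipher shape (not Bitwarden's CipherView).
interface LoginCipher {
  id: string;
  username: string;
  uris: string[];
}

// Reduce a URI to its hostname (subdomain + domain + TLD).
// Returns null for strings that cannot be parsed as a URL.
function normalizeUri(uri: string): string | null {
  try {
    return new URL(uri).hostname;
  } catch {
    return null;
  }
}

// Group ciphers into buckets keyed by username + normalized hostname,
// then keep only buckets that hold two or more ciphers.
function findDuplicateSets(ciphers: LoginCipher[]): Map<string, LoginCipher[]> {
  const buckets = new Map<string, LoginCipher[]>();
  ciphers.forEach((cipher) => {
    const hosts = new Set<string>();
    cipher.uris.forEach((uri) => {
      const host = normalizeUri(uri);
      if (host !== null) {
        hosts.add(host); // a cipher joins each bucket at most once
      }
    });
    hosts.forEach((host) => {
      const key = JSON.stringify([cipher.username, host]);
      const bucket = buckets.get(key) ?? [];
      bucket.push(cipher);
      buckets.set(key, bucket);
    });
  });

  const duplicateSets = new Map<string, LoginCipher[]>();
  buckets.forEach((bucket, key) => {
    if (bucket.length > 1) {
      duplicateSets.set(key, bucket);
    }
  });
  return duplicateSets;
}

// Count distinct ciphers across all sets (a cipher may appear in several).
function countDuplicateCiphers(sets: Map<string, LoginCipher[]>): number {
  const ids = new Set<string>();
  sets.forEach((bucket) => bucket.forEach((c) => ids.add(c.id)));
  return ids.size;
}

// Hypothetical sample data: 5 URIs in total, 3 of them unique, 2 hostnames.
const sampleCiphers: LoginCipher[] = [
  { id: "A", username: "alice", uris: ["https://example.com/login", "https://app.example.com/home"] },
  { id: "B", username: "alice", uris: ["https://example.com/login"] },
  { id: "C", username: "alice", uris: ["https://app.example.com/home", "https://app.example.com/settings"] },
];
```

With this sample, cipher A lands in both buckets, which is why the number of sets (2) and the number of duplicate ciphers reported (3) differ.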
Tool Workflow
The initial use of this tool moves ciphers selected for deletion to the trash. From there they can be easily recovered. The user can decide to empty their vault trash, or else run the tool a second time to permanently delete the ciphers.
Progress spinner inside button, similar to reporting tools:
If a user decides to cancel the operation:
The initial pass will detect duplicates (which in the example below are not currently in the trash, but could be):
First time duplicate deletion moves items to trash:
Duplicate Review Dialog (which in this example shows logins moved to trash in previous 'pass'):
Second 'pass' over duplicates will delete permanently:
If no duplicates are found (migrated from toast to callout):
Question and answer regarding the workflow
Unselecting 1_TEST in one grouping will unselect it from all groupings, and vice versa. If a cipher is selected for deletion or retention in any grouping, that choice will extend to all groupings.
Future Efforts
🌱 The Duplicate Review Dialog displays rows much wider than the dialog box itself, so considerable horizontal scrolling is necessary.
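The shared-selection behavior described in the answer above (a choice made for a cipher in one grouping applies in every grouping) falls out naturally if selection is tracked per cipher id rather than per grouping. The sketch below is only an illustration of that idea; `DeletionSelection`, `selectedInGroup`, and the ids other than 1_TEST are hypothetical and are not taken from the PR.

```typescript
// Hypothetical selection model: one set of cipher ids shared by all groupings.
class DeletionSelection {
  private readonly selectedIds = new Set<string>();

  select(cipherId: string): void {
    this.selectedIds.add(cipherId);
  }

  unselect(cipherId: string): void {
    this.selectedIds.delete(cipherId);
  }

  isSelected(cipherId: string): boolean {
    return this.selectedIds.has(cipherId);
  }
}

// Each grouping only lists cipher ids; it holds no selection state of its own.
function selectedInGroup(groupCipherIds: string[], selection: DeletionSelection): string[] {
  return groupCipherIds.filter((id) => selection.isSelected(id));
}

// Two groupings that both contain 1_TEST.
const groupOne = ["1_TEST", "2_TEST"];
const groupTwo = ["1_TEST", "3_TEST"];

const selection = new DeletionSelection();
["1_TEST", "2_TEST", "3_TEST"].forEach((id) => selection.select(id));

// Unselecting 1_TEST once removes it from the selection in both groupings.
selection.unselect("1_TEST");
```

Because the groupings consult the same set, there is no per-grouping state that could drift out of sync.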
Discussion on default URI matching for duplicate detection
Should default duplicate URI matching behavior differ from default autofill matching?
The comment above describes some changes I made to PR #15967 in order to integrate the same URI matching logic that is currently used in the web extension autofill functionality. Currently, I've set the default URI match to use Base matching, which is the same default used when matching URIs for autofill.
This may not be ideal, because the Base match strategy is eager. Assuming no permissive "Starts with" or permissive regular expressions (both are discouraged by Bitwarden), the "Base" matching strategy is the most likely to match generously because it doesn't consider subdomains, ports, paths, etc. and is only concerned with top and second-level domains. In the context of autofilling credentials, this may lead to slightly annoying behavior if different credentials are used across subdomains. In the context of credential deletion, this may lead to unintended deletions, a destructive action.
The image below demonstrates how the "Base" strategy identified duplicated logins which would not have been identified using "Hostname" or "Host," which may be more suitable defaults for this feature.
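To make the difference between the strategies concrete, here is a rough sketch of how the three comparison keys might differ. It is not Bitwarden's matching code: the `MatchStrategy` type and both function names are hypothetical, "Host" is assumed to mean hostname plus any non-default port, and the base domain is approximated by the last two hostname labels. Real implementations consult the Public Suffix List, so this shortcut gives wrong answers for suffixes such as `co.uk` and for IP addresses.

```typescript
// Hypothetical strategy names for illustration only.
type MatchStrategy = "base" | "hostname" | "host";

// Derive the string that two URIs must share to count as a match.
function matchKey(uri: string, strategy: MatchStrategy): string | null {
  let url: URL;
  try {
    url = new URL(uri);
  } catch {
    return null; // unparseable URIs never match
  }
  switch (strategy) {
    case "host":
      // hostname plus port, when the port is not the scheme default
      return url.host;
    case "hostname":
      // subdomain + domain + TLD; port and path are ignored
      return url.hostname;
    case "base":
      // ASSUMPTION: naive "last two labels" stand-in for a Public Suffix List lookup
      return url.hostname.split(".").slice(-2).join(".");
  }
}

function urisMatch(a: string, b: string, strategy: MatchStrategy): boolean {
  const keyA = matchKey(a, strategy);
  const keyB = matchKey(b, strategy);
  return keyA !== null && keyA === keyB;
}

// Different subdomains: only "base" treats these as the same site.
const crossSubdomainBase = urisMatch("https://app.example.com/login", "https://shop.example.com/", "base");
const crossSubdomainHostname = urisMatch("https://app.example.com/login", "https://shop.example.com/", "hostname");

// Same hostname, different port: "hostname" matches, "host" does not.
const crossPortHostname = urisMatch("https://example.com:8443/", "https://example.com/", "hostname");
const crossPortHost = urisMatch("https://example.com:8443/", "https://example.com/", "host");
```

In this sketch the first pair matches only under "base", which is the kind of generous match that matters more when the follow-up action is deletion rather than autofill.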
Please also see bitwarden/sdk-internal#418
✅ Code Contribution Proposal
Community identified issue
A community forum post from 2018 with considerable engagement describes a scenario where using the import tool with different sources can result in a large quantity of duplicates. Community members are requesting a tool that will allow for de-duplication.
👀 Jump to latest update
My experience
Please also see PR 15967, which demonstrates how this works and contains the currently proposed code changes (the code shown in this discussion has since changed).
First, I imported credentials into Bitwarden from my preferred browser. Next, I imported credentials from a secondary browser, which had nearly all of the same login items. This resulted in the size of my vault nearly doubling. Upon inspecting the duplicated credentials, I noticed:
The original credential had the following structure:
The duplicate credential (created when performing the second import) had the following structure:
Where username and password values are shared between the duplicates.
I have not yet identified how Bitwarden will handle duplicates beyond this level (n > 2), since it appears Item names must be unique. Before doing so, I wanted to share my initial approach to implementing a de-duplication service.
Initial strategy for removing duplicate credentials
My current approach relies on the username being appended to the fully qualified domain name in order to generate the Item name during import. This is not a robust solution, but will allow for initial testing in my specific situation.
git checkout -b feature/duplicate-deletion
Added apps/web/src/app/vault/services/duplicate-cipher.service.ts: