Skip to content
Closed
Show file tree
Hide file tree
Changes from 5 commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
7b5505c
fixes #12810: Implement escaping for keyword separators (FKA #12888)
miguel-cordoba Jul 24, 2025
44257ad
fixes #12810: Implement escaping for keyword separators (FKA #12888)
miguel-cordoba Jul 24, 2025
e6c52dd
fixes #12810: Implement escaping for keyword separators (FKA #12888)
miguel-cordoba Jul 24, 2025
d2ad73d
removes obvious comment, improves CHANGELOG message
miguel-cordoba Jul 24, 2025
465955a
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 24, 2025
4928536
removes obvious comments
miguel-cordoba Jul 24, 2025
2ccc24e
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Jul 24, 2025
1111794
tackle List.of() review comment
miguel-cordoba Jul 24, 2025
83198c0
undo tackle List.of() review comment
miguel-cordoba Jul 24, 2025
cc5fa8b
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 25, 2025
87d1ff6
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 25, 2025
3d2785b
Merge branch 'main' into fix-for-issue-12810
koppor Jul 28, 2025
1b455d3
adds tests after review
miguel-cordoba Jul 29, 2025
2b85d0c
adds trag-bot changes
miguel-cordoba Jul 29, 2025
4b31cd0
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 29, 2025
d15f2ef
- extends Keyword#toString to ensure round-trip integrity
miguel-cordoba Jul 30, 2025
d77bb5e
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Jul 30, 2025
02c252d
- improves Changelog message
miguel-cordoba Jul 30, 2025
027dde0
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 30, 2025
ea690d6
- removes autoscaping on keyword#toString -> KeywordListTest#roundTri…
miguel-cordoba Jul 30, 2025
b6f728f
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Jul 30, 2025
a72b614
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 31, 2025
4c5181b
- removes autoscaping on keyword#toString -> KeywordListTest#roundTri…
miguel-cordoba Aug 3, 2025
b02aeea
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 6, 2025
eef54b6
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Aug 6, 2025
ffeb78d
Working on new approach for parse/serielide depending on context (UI/…
miguel-cordoba Aug 6, 2025
1f321c2
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 10, 2025
3e2636f
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 13, 2025
634d302
Working on new approach for parse/serielide depending on context (UI/…
miguel-cordoba Aug 13, 2025
b2d902f
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Aug 13, 2025
42710fa
WIP: adds old KeyWordList#parse as #bibtexParse
miguel-cordoba Aug 13, 2025
4b2a8d8
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 13, 2025
a2031ed
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 24, 2025
af3c50e
WIP: adds KeywordList#bibtexSerialize (autoescapes delimiter). Needs …
miguel-cordoba Aug 24, 2025
e955fd1
WIP: extends KeywordList#parse and #bibtexSerialize
miguel-cordoba Aug 31, 2025
a3d69f2
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 31, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv

### Added

- We implemented escaping characters `\ ,` when parsing KeywordLists [#12810](https://github.com/JabRef/jabref/issues/12810)
- We added a field for the citation count field on the General tab. [#13477](https://github.com/JabRef/jabref/issues/13477)
- We added automatic lookup of DOI at citation relations [#13234](https://github.com/JabRef/jabref/issues/13234)
- We added focus on the field Link in the "Add file link" dialog. [#13486](https://github.com/JabRef/jabref/issues/13486)
Expand Down
37 changes: 31 additions & 6 deletions jablib/src/main/java/org/jabref/model/entry/KeywordList.java
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.StringTokenizer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

Expand Down Expand Up @@ -52,13 +51,39 @@ public static KeywordList parse(String keywordString, Character delimiter, Chara
Objects.requireNonNull(hierarchicalDelimiter);

KeywordList keywordList = new KeywordList();
List<String> hierarchy = new ArrayList<>();
StringBuilder currentToken = new StringBuilder();
boolean isEscaping = false;

for (int i = 0; i < keywordString.length(); i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this is #12888 (comment) :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review! As I dived into the project I saw that this loop had not been addressed. So I used @ungerts proposed comment, since it is readable and solves the issue. Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very OK. I like links to provide context.

Other reviewers might think: why a for loop etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh ok perfect :)

char currentChar = keywordString.charAt(i);

if (isEscaping) {
currentToken.append(currentChar);
isEscaping = false;
} else if (currentChar == '\\') {
isEscaping = true;
} else if (currentChar == hierarchicalDelimiter) {
// Hierarchical delimiter reached: push current token as sublevel
hierarchy.add(currentToken.toString().trim());
currentToken.setLength(0);
} else if (currentChar == delimiter) {
// Keyword delimiter reached: push current token and then flush full hierarchy as keyword
hierarchy.add(currentToken.toString().trim());
currentToken.setLength(0);
keywordList.add(Keyword.of(hierarchy.toArray(new String[0])));
hierarchy.clear();
} else {
currentToken.append(currentChar);
}
}

StringTokenizer tok = new StringTokenizer(keywordString, delimiter.toString());
while (tok.hasMoreTokens()) {
String chain = tok.nextToken();
Keyword chainRoot = Keyword.of(chain.split(hierarchicalDelimiter.toString()));
keywordList.add(chainRoot);
// Handle the final token
if (!currentToken.isEmpty() || !hierarchy.isEmpty()) {
hierarchy.add(currentToken.toString().trim());
keywordList.add(Keyword.of(hierarchy.toArray(new String[0])));
}

return keywordList;
}

Expand Down
37 changes: 37 additions & 0 deletions jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -115,4 +115,41 @@ void mergeTwoDistinctKeywordsShouldReturnTheTwoKeywordsMerged() {
void mergeTwoListsOfKeywordsShouldReturnTheKeywordsMerged() {
assertEquals(new KeywordList("Figma", "Adobe", "JabRef", "Eclipse", "JetBrains"), KeywordList.merge("Figma, Adobe, JetBrains, Eclipse", "Adobe, JabRef", ','));
}

@Test
void parseKeywordWithEscapedDelimiterDoesNotSplitKeyword() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convert to ParameterizedTest

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gathered all the previous tests in a Parameterized one, I also added roundtrip tests on KeyworList and Keyword. Thank you so much for your review!

assertEquals(new KeywordList("keyword,one", "keywordTwo"),
KeywordList.parse("keyword\\,one, keywordTwo", ',', '>'));
}

@Test
void parseKeywordWithEscapedDelimiterAtEndPreservesDelimiter() {
assertEquals(new KeywordList("keywordOne,", "keywordTwo"),
KeywordList.parse("keywordOne\\,, keywordTwo", ',', '>'));
}

@Test
void parseKeywordWithEscapedBackslashTreatsItAsLiteral() {
assertEquals(new KeywordList("keyword\\", "keywordTwo"),
KeywordList.parse("keyword\\\\, keywordTwo", ',', '>'));
}

@Test
void parseKeywordWithEscapedDelimiterAndHierarchicalDelimiterCreatesHierarchy() {
Keyword expected = Keyword.of("keyword,one", "sub");
assertEquals(new KeywordList(expected),
KeywordList.parse("keyword\\,one > sub", ',', '>'));
}

@Test
void parseKeywordWithMultipleEscapedDelimitersTreatsThemAsLiteral() {
assertEquals(new KeywordList("one,two,three", "four"),
KeywordList.parse("one\\,two\\,three, four", ',', '>'));
}

@Test
void parseKeywordWithTrailingEscapeCharacterTreatsItAsLiteralBackslash() {
assertEquals(new KeywordList("keywordOne\\"),
KeywordList.parse("keywordOne\\\\", ',', '>'));
}
}