Skip to content
Closed
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
7b5505c
fixes #12810: Implement escaping for keyword separators (FKA #12888)
miguel-cordoba Jul 24, 2025
44257ad
fixes #12810: Implement escaping for keyword separators (FKA #12888)
miguel-cordoba Jul 24, 2025
e6c52dd
fixes #12810: Implement escaping for keyword separators (FKA #12888)
miguel-cordoba Jul 24, 2025
d2ad73d
removes obvious comment, improves CHANGELOG message
miguel-cordoba Jul 24, 2025
465955a
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 24, 2025
4928536
removes obvious comments
miguel-cordoba Jul 24, 2025
2ccc24e
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Jul 24, 2025
1111794
tackle List.of() review comment
miguel-cordoba Jul 24, 2025
83198c0
undo tackle List.of() review comment
miguel-cordoba Jul 24, 2025
cc5fa8b
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 25, 2025
87d1ff6
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 25, 2025
3d2785b
Merge branch 'main' into fix-for-issue-12810
koppor Jul 28, 2025
1b455d3
adds tests after review
miguel-cordoba Jul 29, 2025
2b85d0c
adds trag-bot changes
miguel-cordoba Jul 29, 2025
4b31cd0
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 29, 2025
d15f2ef
- extends Keyword#toString to ensure round-trip integrity
miguel-cordoba Jul 30, 2025
d77bb5e
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Jul 30, 2025
02c252d
- improves Changelog message
miguel-cordoba Jul 30, 2025
027dde0
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 30, 2025
ea690d6
- removes autoscaping on keyword#toString -> KeywordListTest#roundTri…
miguel-cordoba Jul 30, 2025
b6f728f
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Jul 30, 2025
a72b614
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Jul 31, 2025
4c5181b
- removes autoscaping on keyword#toString -> KeywordListTest#roundTri…
miguel-cordoba Aug 3, 2025
b02aeea
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 6, 2025
eef54b6
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Aug 6, 2025
ffeb78d
Working on new approach for parse/serielide depending on context (UI/…
miguel-cordoba Aug 6, 2025
1f321c2
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 10, 2025
3e2636f
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 13, 2025
634d302
Working on new approach for parse/serielide depending on context (UI/…
miguel-cordoba Aug 13, 2025
b2d902f
Merge remote-tracking branch 'origin/fix-for-issue-12810' into fix-fo…
miguel-cordoba Aug 13, 2025
42710fa
WIP: adds old KeyWordList#parse as #bibtexParse
miguel-cordoba Aug 13, 2025
4b2a8d8
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 13, 2025
a2031ed
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 24, 2025
af3c50e
WIP: adds KeywordList#bibtexSerialize (autoescapes delimiter). Needs …
miguel-cordoba Aug 24, 2025
e955fd1
WIP: extends KeywordList#parse and #bibtexSerialize
miguel-cordoba Aug 31, 2025
a3d69f2
Merge branch 'main' into fix-for-issue-12810
miguel-cordoba Aug 31, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv

### Added

- We implemented escaping characters `\ ,` when parsing KeywordLists [#12810](https://github.com/JabRef/jabref/issues/12810)
- We added automatic lookup of DOI at citation information. [#13561](https://github.com/JabRef/jabref/issues/13561)
- We added a field for the citation count field on the General tab. [#13477](https://github.com/JabRef/jabref/issues/13477)
- We added automatic lookup of DOI at citation relations [#13234](https://github.com/JabRef/jabref/issues/13234)
Expand Down
35 changes: 29 additions & 6 deletions jablib/src/main/java/org/jabref/model/entry/KeywordList.java
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@
import java.util.List;
import java.util.Objects;
import java.util.Set;
import java.util.StringTokenizer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

Expand Down Expand Up @@ -52,13 +51,37 @@ public static KeywordList parse(String keywordString, Character delimiter, Chara
Objects.requireNonNull(hierarchicalDelimiter);

KeywordList keywordList = new KeywordList();
List<String> hierarchy = new ArrayList<>();
StringBuilder currentToken = new StringBuilder();
boolean isEscaping = false;

for (int i = 0; i < keywordString.length(); i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note this is #12888 (comment) :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review! As I dived into the project I saw that this loop had not been addressed. So I used @ungerts proposed comment, since it is readable and solves the issue. Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very OK. I like links to provide context.

Other reviewers might think: why a for loop etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh ok perfect :)

char currentChar = keywordString.charAt(i);

if (isEscaping) {
currentToken.append(currentChar);
isEscaping = false;
} else if (currentChar == '\\') {
isEscaping = true;
} else if (currentChar == hierarchicalDelimiter) {
hierarchy.add(currentToken.toString().trim());
currentToken.setLength(0);
} else if (currentChar == delimiter) {
hierarchy.add(currentToken.toString().trim());
currentToken.setLength(0);
keywordList.add(Keyword.of(hierarchy.toArray(new String[0])));
hierarchy.clear();
} else {
currentToken.append(currentChar);
}
}

StringTokenizer tok = new StringTokenizer(keywordString, delimiter.toString());
while (tok.hasMoreTokens()) {
String chain = tok.nextToken();
Keyword chainRoot = Keyword.of(chain.split(hierarchicalDelimiter.toString()));
keywordList.add(chainRoot);
// Handle the final token
if (!currentToken.isEmpty() || !hierarchy.isEmpty()) {
hierarchy.add(currentToken.toString().trim());
keywordList.add(Keyword.of(hierarchy.toArray(new String[0])));
}

return keywordList;
}

Expand Down
36 changes: 36 additions & 0 deletions jablib/src/test/java/org/jabref/model/entry/KeywordListTest.java
Original file line number Diff line number Diff line change
@@ -1,7 +1,13 @@
package org.jabref.model.entry;

import java.util.stream.Stream;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.Arguments;
import org.junit.jupiter.params.provider.MethodSource;
import org.junit.jupiter.params.provider.ValueSource;

import static org.junit.jupiter.api.Assertions.assertEquals;

Expand All @@ -16,6 +22,17 @@ void setUp() {
keywords.add("keywordTwo");
}

private static Stream<Arguments> provideParseKeywordCases() {
return Stream.of(
Arguments.of("keyword\\,one, keywordTwo", new KeywordList("keyword,one", "keywordTwo")),
Arguments.of("keywordOne\\,, keywordTwo", new KeywordList("keywordOne,", "keywordTwo")),
Arguments.of("keyword\\\\, keywordTwo", new KeywordList("keyword\\", "keywordTwo")),
Arguments.of("keyword\\,one > sub", new KeywordList(Keyword.of("keyword,one", "sub"))),
Arguments.of("one\\,two\\,three, four", new KeywordList("one,two,three", "four")),
Arguments.of("keywordOne\\\\", new KeywordList("keywordOne\\"))
);
}

@Test
void parseEmptyStringReturnsEmptyList() {
assertEquals(new KeywordList(), KeywordList.parse("", ','));
Expand Down Expand Up @@ -115,4 +132,23 @@ void mergeTwoDistinctKeywordsShouldReturnTheTwoKeywordsMerged() {
void mergeTwoListsOfKeywordsShouldReturnTheKeywordsMerged() {
assertEquals(new KeywordList("Figma", "Adobe", "JabRef", "Eclipse", "JetBrains"), KeywordList.merge("Figma, Adobe, JetBrains, Eclipse", "Adobe, JabRef", ','));
}

@ParameterizedTest
@MethodSource("provideParseKeywordCases")
void parseKeywordWithEscapedDelimiterDoesNotSplitKeyword(String input, KeywordList expected) {
assertEquals(expected, KeywordList.parse(input, ',', '>'));
}

@ParameterizedTest
@ValueSource(strings = {
"Keyword > Keyword",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test all of provideParseKeywordCases

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When extending this test it was revealed that the current implementation does not handle escaping as desired at deserializing, which made me extend the logic on Keyword#getSubchainAsString

"Keyword \\> Keyword"
})
void roundTripPreservesStructure(String original) {
KeywordList parsed = KeywordList.parse(original, ',', '>');
String serialized = parsed.toString();
KeywordList reparsed = KeywordList.parse(serialized, ',', '>');

assertEquals(parsed.toString(), reparsed.toString());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing string representations instead of actual objects may hide structural differences. Should directly compare KeywordList objects to ensure complete equality.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesnt apply here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment above the statement to indicate that the toSTring() fucntionality is tested.

Morevoer, test for

Suggested change
assertEquals(parsed.toString(), reparsed.toString());
assertEquals(original, parsed.toString());

No need for reparsed.

}
}
13 changes: 13 additions & 0 deletions jablib/src/test/java/org/jabref/model/entry/KeywordTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@
import java.util.Set;

import org.junit.jupiter.api.Test;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

import static org.junit.jupiter.api.Assertions.assertEquals;

Expand All @@ -25,4 +27,15 @@ void getAllSubchainsAsStringForSimpleChain() {

assertEquals(expected, keywordChain.getAllSubchainsAsString('>'));
}

@ParameterizedTest
@ValueSource(strings = {
"Keyword > Keyword",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would excpect all cases from provideParseKeywordCases there.

"Keyword \\> Keyword"
})
void getSubchainAsString(String input) {
Keyword keyword = new Keyword(input);
String result = keyword.toString();
assertEquals(input, result);
}
}