Commit f5a403d

wardli committed
[Bug]: Amoro optimization can result in the input files and the merged output files having the same number of files, and this can cause the merge to fail and keep triggering the merge task. #3855
1 parent b24ad3c commit f5a403d

1 file changed: +7 -2 lines changed

amoro-format-iceberg/src/main/java/org/apache/amoro/optimizing/plan/AbstractPartitionPlan.java

Lines changed: 7 additions & 2 deletions
@@ -263,7 +263,9 @@ public Weight getWeight() {
   /**
    * When splitTask has only one undersized segment file, it needs to be triggered again to
    * determine whether to rewrite pos. If needed, add it to rewritePosDataFiles and bin-packing
-   * together, else reserved delete files.
+   * together. If it doesn't need to rewrite pos delete, add it to rewriteDataFiles so it can be
+   * merged with fragment files or other files during the second bin-packing, which prevents file
+   * loss and ensures proper file consolidation.
    */
   protected void disposeUndersizedSegmentFile(SplitTask splitTask) {
     Optional<DataFile> dataFile = splitTask.getRewriteDataFiles().stream().findFirst();
@@ -273,7 +275,10 @@ protected void disposeUndersizedSegmentFile(SplitTask splitTask) {
       if (evaluator().segmentShouldRewritePos(rewriteDataFile, deletes)) {
         rewritePosDataFiles.put(rewriteDataFile, deletes);
       } else {
-        reservedDeleteFiles(deletes);
+        // Add to rewriteDataFiles so it can be merged with other files (fragment files, etc.)
+        // during the second bin-packing. This prevents file loss and ensures files can be
+        // consolidated when possible.
+        rewriteDataFiles.put(rewriteDataFile, deletes);
       }
     }
   }
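
The effect of the changed else branch can be illustrated with a minimal, self-contained sketch. The class below is a hypothetical stand-in, not Amoro's AbstractPartitionPlan: file paths are plain strings, the `patched` flag and the `shouldRewritePos` predicate are invented for illustration, and the old `reservedDeleteFiles(deletes)` call is assumed to do nothing more than record the delete files. It only shows which bucket a lone undersized segment file lands in before and after the commit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Hypothetical, simplified model of the branch touched by this commit. */
class UndersizedSegmentSketch {
  // Buckets named after the fields that appear in the diff.
  public final Map<String, List<String>> rewritePosDataFiles = new LinkedHashMap<>();
  public final Map<String, List<String>> rewriteDataFiles = new LinkedHashMap<>();
  // Assumption: reserving delete files just means remembering them.
  public final List<String> reservedDeleteFiles = new ArrayList<>();

  // Stand-in for evaluator().segmentShouldRewritePos(rewriteDataFile, deletes).
  private final BiPredicate<String, List<String>> shouldRewritePos;
  // Illustration-only switch: false = behavior before the commit, true = after.
  private final boolean patched;

  public UndersizedSegmentSketch(
      BiPredicate<String, List<String>> shouldRewritePos, boolean patched) {
    this.shouldRewritePos = shouldRewritePos;
    this.patched = patched;
  }

  public void disposeUndersizedSegmentFile(String dataFile, List<String> deletes) {
    if (shouldRewritePos.test(dataFile, deletes)) {
      // Unchanged path: the file is queued for a position-delete rewrite.
      rewritePosDataFiles.put(dataFile, deletes);
    } else if (patched) {
      // After the commit: the data file stays in the rewrite set, so a later
      // bin-packing pass can still combine it with other files.
      rewriteDataFiles.put(dataFile, deletes);
    } else {
      // Before the commit: only the delete files are reserved and the data
      // file itself is not queued anywhere.
      reservedDeleteFiles.addAll(deletes);
    }
  }

  public static void main(String[] args) {
    List<String> deletes = List.of("pos-del-1.parquet");

    UndersizedSegmentSketch before = new UndersizedSegmentSketch((f, d) -> false, false);
    before.disposeUndersizedSegmentFile("seg-1.parquet", deletes);
    System.out.println("before: rewriteDataFiles=" + before.rewriteDataFiles.keySet());

    UndersizedSegmentSketch after = new UndersizedSegmentSketch((f, d) -> false, true);
    after.disposeUndersizedSegmentFile("seg-1.parquet", deletes);
    System.out.println("after:  rewriteDataFiles=" + after.rewriteDataFiles.keySet());
  }
}
```

In this model the pre-commit path leaves the segment file out of every rewrite bucket, while the post-commit path keeps it in `rewriteDataFiles`, which is the condition the commit comments describe as necessary for merging it during the second bin-packing.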
