Commit 3451016

Merge pull request #209 from jecisc/speedup-data-encoding
Speedup data preprocessing a lot
2 parents 8881637 + d3766b7 commit 3451016

1 file changed (+12, -0)
src/DataFrame/DataFrame.class.st

Lines changed: 12 additions & 0 deletions
@@ -852,6 +852,18 @@ DataFrame >> crossTabulate: colName1 with: colName2 [
 	^ col1 crossTabulateWith: col2
 ]
 
+{ #category : #copying }
+DataFrame >> dataPreProcessingEncodeWith: anEncoder [
+	"This method is here to speed up pharo-ai/data-preprocessing algos without coupling both projects."
+
+	| copy |
+	copy := self copy.
+	self columns doWithIndex: [ :dataSerie :columnIndex |
+		dataSerie doWithIndex: [ :element :rowIndex | copy at: rowIndex at: columnIndex put: ((anEncoder categories at: columnIndex) indexOf: element) ] ].
+
+	^ copy
+]
+
 { #category : #'data-types' }
 DataFrame >> dataTypeOfColumn: aColumnName [
 	"Given a column name of the DataFrame, it returns the data type of that column"

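The per-cell encoding performed by dataPreProcessingEncodeWith: can be illustrated with plain collections. This is only a sketch: the `categories` array stands in for what `anEncoder categories` is assumed to answer (one collection of known values per column), and `rows` is hypothetical sample data, not taken from the commit. Each element is replaced by its position in the category list of its column.

```smalltalk
| categories rows encoded |
"Hypothetical stand-in for `anEncoder categories`: one list of known values per column."
categories := #( #('a' 'b') #(10 20 30) ).

"Hypothetical sample data: two rows, two columns."
rows := #( #('b' 10) #('a' 30) ).

"Replace each element by its index in the category list of its column."
encoded := rows collect: [ :row |
	row withIndexCollect: [ :element :columnIndex |
		(categories at: columnIndex) indexOf: element ] ].

"encoded = #( #(2 1) #(1 3) )"
```

Note that `indexOf:` answers 0 when the element is not found, so a value missing from a column's category list would be encoded as 0 rather than raising an error.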