Commit Graph

23 Commits

Author SHA1 Message Date
f8eb91fddb Add extra debugging 2023-12-31 10:32:32 +02:00
54db72fd89 Switch to SentencePiece for tokenisation and Roberta for the model 2023-12-30 15:24:16 +02:00
910e0c9d24 Convert to a multi-hot index in the CSV, to simplify our DataSets and DataLoaders 2023-12-30 12:30:43 +02:00
bedf82d8a1 Add original title to story text 2023-12-30 12:29:56 +02:00
61d32c5286 Cleanup, and device-aware training 2023-12-21 11:38:35 +02:00
c9a9e24619 Fix evaluation, as well as progress reporting. 2023-12-19 09:26:27 +02:00
94025fc0c6 Metadata 2023-12-13 20:29:55 +02:00
tim a871c9235c v0.1.4 2023-12-13 20:28:30 +02:00
tim 8eb8d8b17c Update poetry config for Ubuntu LTS 2023-12-13 20:28:23 +02:00
tim 723c6d4378 First working model 2023-12-13 19:15:46 +02:00
tim fe7870e9d4 Get model working (basically) 2023-12-13 11:59:42 +02:00
tim 5dd850d1cb Add reminder about old categories 2023-12-02 00:17:58 +02:00
tim 46f533746e Format for poetry and add debugging 2023-12-02 00:10:09 +02:00
tim 2039b017eb v0.1.2 2023-12-01 22:46:39 +02:00
tim 92f2e47a0f v0.1.1 2023-12-01 22:46:30 +02:00
tim 453f3ac9de Add dependencies 2023-12-01 22:18:23 +02:00
tim 12b95d6d3f Move to new location 2023-12-01 21:44:36 +02:00
tim 701c28353d Clean up some minor issues (like iterating over the DataSet) & simplify 2023-12-01 21:38:00 +02:00
tim 235c58f3c5 Add possible split between training and validation data 2023-12-01 21:37:59 +02:00
tim da6f0142e0 First pass at imbibing a CSV of data and turning it into a dataset, and thence into a dataloader 2023-12-01 21:37:59 +02:00
tim f60aeb0afe Convert a bunch of XML files into a CSV dataset 2023-12-01 21:37:59 +02:00
tim fcee47be08 v0.1.1 2023-12-01 21:27:25 +02:00
tim 8dc64f113a v0.1.0 2023-12-01 21:24:42 +02:00