To-Do List
! P1
- AST, CST (PT) syntax trees (use Tree-sitter)
- DFG (data flow graph)
- AST-T5, StructCoder, GraphCodeBERT, CodeT5, AST-Transformer, CodeT5+ (imp)
- masked span prediction (MSP) (sentinel tokens at decoder)
- BPE (byte pair encoding) tokenizer
- Momentum encoder
- Beam search (used in inference)
!! P2
- CodeBLEU metric
- ROUGE metric
- ELECTRA
- generator as adversary using RL