Model | Touchdown Dev TC↑ | Touchdown Dev SPD↓ | Touchdown Dev nDTW↑ | Touchdown Test TC↑ | Touchdown Test SPD↓ | Touchdown Test nDTW↑ | Map2seq Dev TC↑ | Map2seq Dev SPD↓ | Map2seq Dev nDTW↑ | Map2seq Test TC↑ | Map2seq Test SPD↓ | Map2seq Test nDTW↑ |
---|---|---|---|---|---|---|---|---|---|---|---|---|
RCONCAT (2019) | 10.60 | 20.40 | 22.50 | 11.80 | 20.40 | 22.90 | 17.10 | - | 30.70 | 14.70 | - | 27.70 |
GA (2019) | 12.00 | 18.70 | 25.20 | 11.90 | 19.00 | 24.90 | 18.20 | - | 33.00 | 17.00 | - | 30.10 |
VLN-Trans (2021) | 15.00 | 20.30 | 27.00 | 16.20 | 20.80 | 27.80 | 18.60 | - | 31.10 | 17.00 | - | 29.50 |
ARC+L2S (2020) | 19.48 | 17.05 | - | 16.68 | 18.84 | - | - | - | - | - | - | - |
ORAR (2022) | 30.05 | 11.12 | 45.50 | 29.60 | 11.79 | 45.30 | 49.88 | **5.87** | 62.70 | 47.75 | 6.53 | 62.10 |
VELMA (2023) | 29.83 | 14.67 | 43.44 | 27.38 | 15.03 | 41.93 | 52.75 | 6.78 | 66.45 | 48.70 | 6.80 | 62.37 |
PM-VLN (2023) | 33.00 | 23.60 | - | 33.40 | 23.80 | - | - | - | - | - | - | - |
VLN-Video (2024) | 34.50 | 9.60 | - | 31.70 | 11.20 | - | - | - | - | - | - | - |
Loc4Plan (2024) | 34.50 | 10.50 | - | 32.90 | 11.50 | - | 48.00 | 7.00 | - | 45.30 | 7.20 | - |
FLAME | **41.28** | **9.14** | **55.96** | **40.20** | **9.53** | **54.56** | **56.95** | 5.95 | **71.36** | **52.44** | **5.91** | **67.72** |
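
Comparison with state-of-the-art models on the Touchdown and Map2seq datasets. TC is task completion rate, SPD is shortest-path distance, and nDTW is normalized Dynamic Time Warping; ↑ means higher is better and ↓ means lower is better. Bold values indicate the best performance in each column, and "-" marks results that were not reported.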
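
For readers unfamiliar with the nDTW column, the following is a minimal sketch of the standard definition, nDTW(R, Q) = exp(-DTW(R, Q) / (|R| · d_th)), where R is the reference path and Q the agent's path. It is not the evaluation code behind the table. The names `dtw`, `ndtw`, `dist`, and `threshold` are illustrative assumptions, and the toy example uses 1-D positions with absolute difference as the distance, whereas the benchmarks measure distances on the navigation graph. The table appears to report the score scaled by 100.

```python
import math
from typing import Callable, Sequence, TypeVar

Node = TypeVar("Node")


def dtw(reference: Sequence[Node], query: Sequence[Node],
        dist: Callable[[Node, Node], float]) -> float:
    """Dynamic Time Warping cost between two paths (O(n*m) dynamic program)."""
    n, m = len(reference), len(query)
    # cost[i][j] = minimal cost of aligning reference[:i] with query[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(reference[i - 1], query[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # advance reference only
                                    cost[i][j - 1],      # advance query only
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def ndtw(reference: Sequence[Node], query: Sequence[Node],
         dist: Callable[[Node, Node], float], threshold: float) -> float:
    """Normalized DTW in (0, 1]; 1.0 means the paths align with zero cost."""
    return math.exp(-dtw(reference, query, dist) / (len(reference) * threshold))


# Toy example (hypothetical data): positions on a line, threshold of 1 unit.
def abs_dist(a: float, b: float) -> float:
    return abs(a - b)


perfect = ndtw([0, 1, 2], [0, 1, 2], abs_dist, threshold=1.0)
off_by_one = ndtw([0, 1, 2], [0, 1, 3], abs_dist, threshold=1.0)
```

Because the score decays exponentially with the alignment cost, an agent that follows the reference route but stops slightly off target still earns partial credit. This is why nDTW complements TC, which only checks where the agent stops.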