stanford-crfm/helm: v0.5.2 ...
Main Authors: | |
---|---|
Format: | Software |
Language: | unknown |
Published: | Zenodo 2024 |
Subjects: | |
Online Access: | https://dx.doi.org/10.5281/zenodo.12018094 https://zenodo.org/doi/10.5281/zenodo.12018094 |
Summary: | Scenarios: Updated VHELM scenarios for VLMs (#2719, #2684, #2685, #2641, #2691); Updated Image2Struct scenarios (#2608, #2640, #2660, #2661); Added Automatic GPT4V Evaluation for VLM Originality Evaluation; Added FinQA scenario (#2588); Added AIR-Bench 2024 (#2698, #2706, #2710, #2712, #2713); Fixed entity_data_imputation scenario breakage by mirroring source data files (#2750). Models: Added google-cloud-aiplatform~=1.48 dependency requirement for Vertex AI client (#2628); Fixed bug with Vertex AI client error handling (#2614); Fixed bug with Arctic tokenizer (#2615); Added Qwen1.5 110B Chat (#2621); Added TogetherCompletionClient (#2629); Fixed bugs with Yi Chat and Llama 3 Chat on Together (#2636); Added Optimum Intel (#2609, #2674); Added GPT-4o model (#2649, #2656); Added SEA-LION 7B and SEA-LION 7B Instruct (#2647); Added more Gemini 1.5 Flash and Pro versions (#2653, #2664, #2718, #2718); Added Gemini 1.0 Pro 002 (#2664); Added Command R and Command R+ models (#2548); Fixed GPT4V Evaluator Out of Option Range Issue ... |
---|
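The `google-cloud-aiplatform~=1.48` requirement listed in the summary uses the PEP 440 "compatible release" operator, which for a two-segment version is equivalent to `>=1.48, ==1.*`. The following is a minimal sketch of that rule, assuming plain numeric versions only (no pre-releases, post-releases, or epochs). The helper names `parse_release` and `is_compatible` are hypothetical and are not part of HELM or pip; in practice a library such as `packaging` would normally be used instead.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a plain numeric release such as '1.48.0' into (1, 48, 0)."""
    return tuple(int(part) for part in version.split("."))


def pad_to(parts: tuple[int, ...], width: int) -> tuple[int, ...]:
    """Right-pad with zeros so that 1.48 and 1.48.0 compare as equal."""
    return parts + (0,) * (width - len(parts))


def is_compatible(candidate: str, base: str) -> bool:
    """Return True if `candidate` satisfies `~=base` (illustrative helper).

    Assumption: both arguments are dot-separated integers only.
    """
    base_parts = parse_release(base)
    if len(base_parts) < 2:
        raise ValueError("~= needs at least two release segments")
    cand_parts = parse_release(candidate)

    # All but the last segment of the base must match exactly (the ==1.* part).
    prefix_len = len(base_parts) - 1
    same_prefix = cand_parts[:prefix_len] == base_parts[:prefix_len]

    # The candidate must not be older than the base (the >=1.48 part).
    width = max(len(base_parts), len(cand_parts))
    not_older = pad_to(cand_parts, width) >= pad_to(base_parts, width)

    return same_prefix and not_older


if __name__ == "__main__":
    for version in ["1.47.9", "1.48.0", "1.50.2", "2.0.0"]:
        print(version, is_compatible(version, "1.48"))
```

Under this rule any 1.x release at or above 1.48 is accepted, while a 2.x release is rejected, which is why the operator is commonly used to pin a major version without freezing minor updates.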