https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#head https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://www.nanopub.org/nschema#hasAssertion https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://www.nanopub.org/nschema#hasProvenance https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#provenance https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://www.nanopub.org/nschema#hasPublicationInfo https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#pubinfo https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.nanopub.org/nschema#Nanopublication https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://arvix.org/abs/2207.15796 https://sense-nets.xyz/hasZoteroItemType webpage https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://purl.org/dc/terms/creator https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJY https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://purl.org/spar/cito/discusses https://arvix.org/abs/2207.15796 https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://purl.org/spar/cito/linksTo https://arvix.org/abs/2207.15796 https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://www.w3.org/2000/01/rdf-schema#comment New paper alert! 🚨 We've been exploring the impact of context on LLM performance evaluation. Turns out, evaluating models on individual examples might not tell the whole story. #MachineLearning #AI Our findings suggest that batch evaluation allows models to identify patterns and tendencies, leading to more nuanced assessments. Plus, a two-step decision process (analysis + scoring) shows promising results. Exciting times for ML eval! 📊🧠 To learn more, check out the paper: https://arvix.org/abs/2207.15796 https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://schema.org/keywords AI https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://schema.org/keywords LLM https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://schema.org/keywords MachineLearning https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://schema.org/keywords batch-evaluation https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://schema.org/keywords performance-evaluation https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://schema.org/keywords two-step-decision-process https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion https://sense-nets.xyz/announcesResource https://arvix.org/abs/2207.15796 https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#provenance https://sense-nets.xyz/ http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://www.w3.org/ns/prov#SoftwareAgent https://sense-nets.xyz/ http://www.w3.org/ns/prov#actedOnBehalfOf https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJY https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#activity http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://sense-nets.xyz/supervisedActivity https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#activity http://www.w3.org/ns/prov#wasAssociatedWith https://sense-nets.xyz/ https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://www.w3.org/ns/prov#linksTo https://x.com/sensenets_demo/status/1839674524729483541 https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://www.w3.org/ns/prov#wasAssociatedWith https://x.com/sensenets_demo https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://www.w3.org/ns/prov#wasAttributedTo https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJY https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#assertion http://www.w3.org/ns/prov#wasGeneratedBy https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#activity https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJY http://xmlns.com/foaf/0.1/account https://x.com/sensenets_demo https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#pubinfo https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#sig http://purl.org/nanopub/x/hasAlgorithm RSA https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#sig http://purl.org/nanopub/x/hasPublicKey MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEArHtI92jm8pAYVsvJabxLGfOT+7G0JyJGh2gwjB5x2pFPga6wWTd+rNBWWUZViIFnaJrBEsJpgdnoupLU9ppwn+khMiGRfxqGsDDzwHcj3Jc75CRys7d3etwXdBdoXfBgjsJiZBazwm13idr6tljRrC1TaEJBnRQAqzBw9cLDeGY77cSznzXT39feUGT168dpCSE9O6u/48DvvWVqciHGsH9cQ+LroJJVsMrorwtsdZnAK+q48wtIP6pIpw5shSJ5LnA0qeN/f4TvTFDV6ItYIXjiWWpTECc/Bxmfnyat3B5xWCu9nvz8fEs7Ns0TuzQwT3/K55iSKDEIi/E0nO97xwIDAQAB https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#sig http://purl.org/nanopub/x/hasSignature UfNKMSMxjJF6FmekyAFM3JrDGaiwLMq8OK0b3TsssKDHOlONBvYOzIWsO+Q5sDE3EGKMNcf0L9RvIgAwkaOJd4jgM4DgISHMEE7mOdJJ8+ogj3qea5jjPDXjwPAaUC1v51Hzc7v40LKALWGD3uJEyorHVpAL1z8FO9DlrLbu9sYzQ9zUHxvnRl0fJKGXkzzT1Z5ODlEs5c5/oq2L8LtKlDg5NSW/o2+5ELcKUDXF9cB2qGy8mymmlFXId4D4Q2BuE52/YtOLdRCJQhiEJaC9ZEw8NBUmDILw3NmKj7kOaU9BZzhQkWfc415rNBJSbXxwj6uM5JQIxTX6zUcwTEa9Sw== https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#sig http://purl.org/nanopub/x/hasSignatureTarget https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#sig http://purl.org/nanopub/x/singedBy https://sense-nets.xyz/ https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0#sig http://www.w3.org/ns/prov#wasAssociatedWith https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJYsigningDelegation https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://purl.org/dc/terms/created 2024-09-27T15:38:26.351Z https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://purl.org/dc/terms/creator https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJY https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://purl.org/dc/terms/license https://creativecommons.org/licenses/by/4.0/ https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://purl.org/nanopub/x/hasNanopubType https://sense-nets.xyz/SemanticPost https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://purl.org/nanopub/x/wasCreatedAt https://sense-nets.xyz/ https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 http://www.w3.org/2000/01/rdf-schema#label CoSMO Semantic Post https://w3id.org/np/RA6amI79BFW1axJuNM0PmqefGYO-SWR72EoGG5fb0KuB0 https://sense-nets.xyz/hasRootSigner 0x5b9967FC42C160f6146d5ea1f0d08E88370f370b https://w3id.org/np/RA8InlmUPoZ6CTtHP_RkqFBHJSnasnRcjI3qz7EJ-nHJY http://xmlns.com/foaf/0.1/name Quinn Zhang, PhD