Matches in Nanopublications for { <https://neverblink.eu/ontologies/llm-kg/methods#GRPO> ?p ?o ?g. }
Showing items 1 to 5 of 5 with 100 items per page.
- GRPO type Workflow assertion.
- GRPO type Workflow assertion.
- GRPO label "GRPO" assertion.
- GRPO label "GRPO" assertion.
- GRPO comment "GRPO is a reinforcement learning method that serves as a comparative baseline in the ablation studies, where its performance in boosting LLMs' multi-hop reasoning is contrasted with other fine-tuning and optimization strategies, including the paper's Self-improved Adaptive DPO." assertion.