PKU-Alignment

Loves Sharing and Open-Source, Making AI Safer.

Large language models (LLMs) have immense potential in the field of general intelligence but also carry significant risks. As a research team at Peking University, we focus on alignment techniques for large language models, such as safety alignment, to improve model safety and reduce toxicity.

You are welcome to follow our AI safety projects:

Pinned

  1. omnisafe Public

    OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 643 stars · 81 forks

  2. safety-gymnasium Public

    Safety-Gymnasium is a highly scalable and customizable safe reinforcement learning environment library.

    Python · 132 stars · 15 forks

  3. safe-rlhf Public

    Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 659 stars · 40 forks

  4. Safe-Policy-Optimization Public

    This is a benchmark repository for safe reinforcement learning algorithms

    Python · 198 stars · 22 forks

Repositories

  • safe-rlhf Public

    Safe-RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 659 stars · Apache-2.0 · 40 forks · 4 issues · 1 pull request · Updated Jul 10, 2023
  • 18 stars · 0 forks · 0 issues · 0 pull requests · Updated Jul 10, 2023
  • safety-gymnasium Public

    Safety-Gymnasium is a highly scalable and customizable safe reinforcement learning environment library.

    Python · 132 stars · Apache-2.0 · 15 forks · 1 issue · 3 pull requests · Updated Jul 9, 2023
  • omnisafe Public

    OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 643 stars · Apache-2.0 · 81 forks · 1 issue · 0 pull requests · Updated Jul 6, 2023
  • Safe-Policy-Optimization Public

    This is a benchmark repository for safe reinforcement learning algorithms

    Python · 198 stars · Apache-2.0 · 22 forks · 0 issues · 0 pull requests · Updated Jun 30, 2023
  • beavertails Public

    BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).

    12 stars · Apache-2.0 · 0 forks · 0 issues · 0 pull requests · Updated Jun 15, 2023
  • .github Public
    0 stars · 0 forks · 0 issues · 0 pull requests · Updated May 31, 2023
  • ReDMan Public

    ReDMan is an open-source simulation platform that provides a standardized implementation of safe RL algorithms for Reliable Dexterous Manipulation.

    Python · 7 stars · Apache-2.0 · 0 forks · 0 issues · 0 pull requests · Updated May 2, 2023