Writings about the Good and the Bad of AI

Today’s News

  1. LLMs use reinforcement learning

  2. Is GPT-4 becoming less accurate and less useful?

  3. Can AI be used to create bioweapons?

  4. A good AI product

  5. Some missing information

See the five reasons why LLMs use reinforcement learning in the article below. ChatGPT is fine-tuned with Reinforcement Learning from Human Feedback (RLHF) rather than with Supervised Learning (SL) alone, to address several challenges. According to the article, SL only predicts ranks and fails to produce coherent responses, while RLHF estimates response quality and enables coherent conversations. SL's token-level loss is inadequate for maintaining context and coherence across a conversation, which is where RLHF excels. Empirical evidence shows that RLHF performs better because it considers cumulative rewards over the whole conversation. Models like InstructGPT and ChatGPT combine the two: SL establishes the basic structure and content, while RLHF improves response accuracy.
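
To make the contrast between a token-level loss and a response-level reward concrete, here is a minimal sketch. It is not how ChatGPT is trained: it assumes a simple REINFORCE-style objective in place of the full RLHF pipeline, and the function names, probabilities, and reward value are all hypothetical.

```python
import math

def token_level_loss(token_probs):
    """Supervised-style loss: average negative log-probability per reference token."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)

def sequence_level_objective(token_probs, reward):
    """REINFORCE-style surrogate (an assumed simplification of RLHF):
    one scalar reward for the whole response scales its total log-probability."""
    log_prob = sum(math.log(p) for p in token_probs)
    return -reward * log_prob

# Hypothetical probabilities the model assigned to each token of one response.
probs = [0.9, 0.8, 0.5]

sl_loss = token_level_loss(probs)                      # scores tokens one by one
rl_loss = sequence_level_objective(probs, reward=1.0)  # scores the response as a whole
print(f"token-level loss: {sl_loss:.3f}, sequence-level objective: {rl_loss:.3f}")
```

The first function judges each token against a reference in isolation, while the second lets a single judgment about the entire response (here a made-up reward of 1.0) drive the update, which is the distinction the summary above draws.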

Is GPT-4 less accurate?

The study evaluates the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on diverse tasks and finds that performance varies over time. GPT-4 (March) excels at identifying prime numbers, but its accuracy drops drastically in June. GPT-3.5 (June) improves on this task compared to March. GPT-4 becomes less willing to answer sensitive questions in June. Both models make more code generation mistakes in June. The results emphasize the importance of monitoring large language model quality over time.
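
The kind of monitoring the study calls for can be sketched as follows. This is only an illustration: the answer sets are made-up placeholders, not data from the study, and `accuracy` and the snapshot names are hypothetical.

```python
def is_prime(n: int) -> bool:
    """Ground truth by trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

def accuracy(answers: dict[int, bool]) -> float:
    """Fraction of numbers for which the model's 'is it prime?' answer is correct."""
    correct = sum(1 for n, said_prime in answers.items() if said_prime == is_prime(n))
    return correct / len(answers)

# Placeholder answers from two snapshots of the same model (NOT real study data).
snapshot_march = {7: True, 9: False, 11: True, 15: False}
snapshot_june = {7: True, 9: True, 11: False, 15: False}

# A negative drift means the later snapshot is less accurate on this task.
drift = accuracy(snapshot_june) - accuracy(snapshot_march)
print(f"accuracy change between snapshots: {drift:+.2f}")
```

Running the same fixed question set against each new model version and tracking the change is one simple way to notice a regression like the one reported above.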

Can AI be used for creating bioweapons? AI experts suggest AI rules.

Yoshua Bengio suggests international cooperation to regulate AI development, similar to the rules for nuclear technology. Dario Amodei warns that AI could be misused to create dangerous bioweapons within two years. Stuart Russell highlights AI's complexity, which makes it harder to control than other technologies.

Good AI (per Terms)

We read the Terms from AI companies to see which products look the safest to use (from a Privacy and Terms of Service perspective, not the software itself). Does the company safeguard privacy? Does it refrain from using your input data, or the output data, for its own purposes? We also look at other criteria.

Today’s Good AI software is Fillout, a form creation tool.

Random information AI will miss!

These are behind paywalls, so AI might miss them:

  1. Weekly Icodec versus Daily Glargine U100 in Type 2 Diabetes without Previous Insulin

  2. Cilta-cel or Standard Care in Lenalidomide-Refractory Multiple Myeloma

  3. Investigation of a shotcrete accelerator for targeted control of material properties for 3D concrete printing injection method