bnew



Generating Illustrated Instructions​

Sachit Menon¹,², Ishan Misra¹, Rohit Girdhar¹

¹Meta GenAI, ²Columbia University





Our method, StackedDiffusion, addresses the new task of generating illustrated instructions for any user query.​


Abstract​

We introduce the new task of generating Illustrated Instructions, i.e., visual instructions customized to a user's needs.

We identify desiderata unique to this task, and formalize it through a suite of automatic and human evaluation metrics, designed to measure the validity, consistency, and efficacy of the generations. We combine the power of large language models (LLMs) together with strong text-to-image generation diffusion models to propose a simple approach called StackedDiffusion, which generates such illustrated instructions given text as input. The resulting model strongly outperforms baseline approaches and state-of-the-art multimodal LLMs; and in 30% of cases, users even prefer it to human-generated articles.

Most notably, it enables various new and exciting applications far beyond what static articles on the web can provide, such as personalized instructions complete with intermediate steps and pictures in response to a user's individual situation.
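
The paper itself includes no code here, but the overall recipe (an LLM drafts the step texts, and a text-to-image diffusion model illustrates each step) can be sketched roughly as follows. Everything below is a toy stand-in: `draft_steps` and `render_step` are hypothetical callables, and note that StackedDiffusion actually generates the step images jointly (in a stacked layout) rather than one at a time as this sketch does.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class IllustratedStep:
    text: str
    image: bytes  # rendered illustration for this step

def generate_illustrated_instructions(
    goal: str,
    draft_steps: Callable[[str], List[str]],  # e.g. an LLM prompted with the user's goal
    render_step: Callable[[str], bytes],      # e.g. a text-to-image diffusion model
) -> List[IllustratedStep]:
    """Draft the step texts with an LLM, then render one illustration per step."""
    steps = draft_steps(goal)
    return [IllustratedStep(text=s, image=render_step(s)) for s in steps]

# Toy stand-ins so the sketch runs end to end.
fake_llm = lambda goal: [f"Step {i + 1} toward: {goal}" for i in range(3)]
fake_renderer = lambda text: f"<image for '{text}'>".encode()

for step in generate_illustrated_instructions("fix a flat bicycle tire", fake_llm, fake_renderer):
    print(step.text, "->", len(step.image), "bytes")
```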

Overview​


Applications​

Error Correction​

StackedDiffusion provides updated instructions in response to unexpected situations, like a user error.


Goal Suggestion​

Rather than just illustrating a given goal, StackedDiffusion can suggest a goal matching the user's needs.


Personalization​

One of the most powerful uses of StackedDiffusion is to personalize instructions to a user's circumstances.


Knowledge Application​

The LLM's knowledge enables StackedDiffusion to show the user how to achieve goals they didn't even know to ask about.

 

bnew



PlayFusion: Skill Acquisition via Diffusion from Language-Annotated Play​


Lili Chen* Shikhar Bahl* Deepak Pathak

Carnegie Mellon University

Conference on Robot Learning (CoRL) 2023

* equal contribution




We run our approach on 7 different environments, including 3 real-world settings. We show the results of running our policy below. All goals are unseen at training time.


Play Data Collection​

We collect language-annotated play data using teleoperation. This process is fast and efficient (< 1hr per task).

Abstract​

Learning from unstructured and uncurated data has become the dominant paradigm for generative approaches in language or vision. Such unstructured and unguided behavior data, commonly known as play, is also easier to collect in robotics but much more difficult to learn from due to its inherently multimodal, noisy, and suboptimal nature. In this paper, we study this problem of learning goal-directed skill policies from unstructured play data which is labeled with language in hindsight. Specifically, we leverage advances in diffusion models to learn a multi-task diffusion model to extract robotic skills from play data. Using a conditional denoising diffusion process in the space of states and actions, we can gracefully handle the complexity and multimodality of play data and generate diverse and interesting robot behaviors. To make diffusion models more useful for skill learning, we encourage robotic agents to acquire a vocabulary of skills by introducing discrete bottlenecks into the conditional behavior generation process. In our experiments, we demonstrate the effectiveness of our approach across a wide variety of environments in both simulation and the real world.​





Method​

PlayFusion extracts useful skills from language-annotated play by leveraging discrete bottlenecks in both the language embedding and diffusion model U-Net. We generate robot trajectories via an iterative denoising process conditioned on language and current state.​
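
As a rough illustration of the two ingredients described above, here is a minimal PyTorch sketch: a discrete bottleneck that snaps a continuous language embedding onto a learned codebook of skills, and a denoiser that predicts the noise on an action sequence conditioned on the current state and the discretized skill code. The module names, sizes, and plain MLP are illustrative assumptions, not PlayFusion's actual implementation (which uses a U-Net).

```python
import torch
import torch.nn as nn

class DiscreteBottleneck(nn.Module):
    """Snap a continuous conditioning vector onto the nearest entry of a skill codebook."""
    def __init__(self, num_codes=64, dim=128):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z):                                  # z: (B, dim)
        dists = torch.cdist(z, self.codebook.weight)       # distance to every code
        idx = dists.argmin(dim=-1)                         # nearest code per sample
        z_q = self.codebook(idx)
        return z + (z_q - z).detach(), idx                 # straight-through estimator

class ConditionalDenoiser(nn.Module):
    """Predict the noise on a short action sequence, given state, skill code and timestep."""
    def __init__(self, act_dim=7, horizon=16, state_dim=32, cond_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(act_dim * horizon + state_dim + cond_dim + 1, 256),
            nn.ReLU(),
            nn.Linear(256, act_dim * horizon),
        )

    def forward(self, noisy_actions, state, skill_code, t):
        x = torch.cat([noisy_actions.flatten(1), state, skill_code, t], dim=-1)
        return self.net(x)

bottleneck = DiscreteBottleneck()
denoiser = ConditionalDenoiser()
lang = torch.randn(4, 128)                                 # pretend language embedding
skill, _ = bottleneck(lang)
noise_pred = denoiser(torch.randn(4, 16, 7), torch.randn(4, 32), skill, torch.rand(4, 1))
print(noise_pred.shape)                                    # torch.Size([4, 112])
```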


 

bnew


Free3D: Consistent Novel View Synthesis without 3D Representation​


Chuanxia Zheng, Andrea Vedaldi

Visual Geometry Group, University of Oxford


Free3D synthesizes consistent novel views without the need for explicit 3D representations.

Abstract​

We introduce Free3D, a simple approach designed for open-set novel view synthesis (NVS) from a single image.

Similar to Zero-1-to-3, we start from a pre-trained 2D image generator for generalization, and fine-tune it for NVS. Compared to recent and concurrent works, we obtain significant improvements without resorting to an explicit 3D representation, which is slow and memory-consuming.

We do so by encoding the target camera pose better via a new per-pixel ray conditioning normalization (RCN) layer. The latter injects camera pose information into the underlying 2D image generator by telling each pixel its specific viewing direction. We also improve multi-view consistency via a lightweight multi-view attention layer and multi-view noise sharing. We train Free3D on the Objaverse dataset and demonstrate excellent generalization to various new categories in several large new datasets, including OmniObject3D and Google Scanned Objects (GSO).

Framework​





The overall pipeline of our Free3D. (a) Given a single source input image, the proposed architecture jointly predicts multiple target views, instead of processing them independently. To achieve consistent novel view synthesis without the need for a 3D representation, (b) we first propose a novel ray conditioning normalization (RCN) layer, which uses a per-pixel oriented camera ray to modulate the latent features, enabling the model to capture more precise viewpoints. (c) A memory-friendly pseudo-3D cross-attention module is introduced to efficiently bridge information across multiple generated views. Note that the similarity score here is computed only across views (temporally) rather than spatially, resulting in minimal computational and memory cost.
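
For intuition, here is a hedged PyTorch sketch of a per-pixel ray-conditioned modulation layer in the spirit of the RCN idea above: each latent pixel is normalized, then scaled and shifted by parameters predicted from that pixel's camera-ray encoding. The channel sizes, the 6-channel origin-plus-direction ray encoding, and the small conv head are assumptions for illustration, not Free3D's released code.

```python
import torch
import torch.nn as nn

class RayConditionedNorm(nn.Module):
    """Normalize latent features, then scale/shift each pixel using its camera-ray encoding."""
    def __init__(self, feat_ch=64, ray_ch=6, hidden=64):
        super().__init__()
        self.norm = nn.GroupNorm(8, feat_ch, affine=False)
        self.to_scale_shift = nn.Sequential(               # per-pixel modulation parameters
            nn.Conv2d(ray_ch, hidden, 1), nn.SiLU(),
            nn.Conv2d(hidden, 2 * feat_ch, 1),
        )

    def forward(self, feats, rays):
        # feats: (B, C, H, W) latent features; rays: (B, 6, H, W) per-pixel origin + direction
        scale, shift = self.to_scale_shift(rays).chunk(2, dim=1)
        return self.norm(feats) * (1 + scale) + shift

layer = RayConditionedNorm()
out = layer(torch.randn(2, 64, 32, 32), torch.randn(2, 6, 32, 32))
print(out.shape)                                           # torch.Size([2, 64, 32, 32])
```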

Video​


Results​



NVS for a given camera viewpoint​

Free3D significantly improves the accuracy of the generated pose compared to existing state-of-the-art methods on various datasets, including Objaverse (top two rows), OmniObject3D (middle two) and GSO (bottom two).






360-degree rendering for circle path​

Using Free3D, you can directly render a consistent 360-degree video without the need for an additional explicit 3D representation or network.​

More rendered videos​

Videos on Objaverse Dataset​

Videos on OmniObject3D Dataset​

Videos on GSO Dataset​


Related Links​

There's a lot of excellent work that was introduced around the same time as ours.

Stable Video Diffusion fine-tunes an image-to-video diffusion model for multi-view generation.
Efficient-3DiM fine-tunes Stable Diffusion with a stronger vision transformer, DINOv2.
Consistent-1-to-3 uses epipolar attention to extract coarse results for the diffusion model.
One-2-3-45 and One-2-3-45++ directly train an additional 3D network using the outputs of a multi-view generator.

MVDream, Consistent123 and Wonder3D also train multi-view diffusion models, yet still require post-processing for video rendering.
Some works incorporate a 3D representation into the latent diffusion model, such as SyncDreamer and ConsistNet.

Acknowledgements​

Many thanks to Stanislaw Szymanowicz, Edgar Sucar, and Luke Melas-Kyriazi of VGG for insightful discussions and Ruining Li, Eldar Insafutdinov, and Yash Bhalgat of VGG for their helpful feedback. We would also like to thank the authors of Zero-1-to-3 and Objaverse-XL for their helpful discussions.​
 

bnew




WonderJourney:

Going from Anywhere to Everywhere

  • Stanford University

  • Google Research



Abstract
We introduce WonderJourney, a modularized framework for perpetual scene generation. Unlike prior work on view generation that focuses on a single type of scene, we start at any user-provided location (specified by a text description or an image) and generate a journey through a long sequence of diverse yet coherently connected 3D scenes. We leverage an LLM to generate textual descriptions of the scenes in this journey, a text-driven point cloud generation pipeline to make a compelling and coherent sequence of 3D scenes, and a large VLM to verify the generated scenes. We show compelling, diverse visual results across various scene types and styles, forming imaginary "wonderjourneys".
No, no! The adventures first, explanations take such a dreadful time. --- Alice's Adventures in Wonderland
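
The modular loop described in the abstract (an LLM proposes the next scene description, a text-driven point-cloud pipeline builds it, and a VLM verifies it) can be sketched as plain Python. All three components below are hypothetical stand-in callables, not WonderJourney's actual modules.

```python
from typing import Callable, List

def wonderjourney(
    start: str,
    num_scenes: int,
    describe_next_scene: Callable[[List[str]], str],  # LLM: propose the next scene description
    generate_scene: Callable[[str], dict],            # text-driven point-cloud pipeline
    scene_is_valid: Callable[[dict], bool],           # VLM: check the generated scene
    max_retries: int = 3,
) -> List[dict]:
    descriptions, scenes = [start], []
    for _ in range(num_scenes):
        desc = describe_next_scene(descriptions)
        for _ in range(max_retries):                  # regenerate until the verifier accepts
            scene = generate_scene(desc)
            if scene_is_valid(scene):
                break
        descriptions.append(desc)
        scenes.append(scene)
    return scenes

# Toy stand-ins so the sketch runs.
journey = wonderjourney(
    "a quiet harbor at dusk", 3,
    describe_next_scene=lambda hist: f"scene after '{hist[-1]}'",
    generate_scene=lambda d: {"desc": d, "points": []},
    scene_is_valid=lambda s: True,
)
print([s["desc"] for s in journey])
```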

Going from Anywhere​
Starting from an arbitrary location (specified by either text or an image), WonderJourney generates a sequence of diverse yet coherently connected 3D scenes (i.e., a "wonderjourney") along a camera trajectory. We render a "wonderjourney" using a back-and-forth camera trajectory.
Rendered wonderjourney




 

bnew


Biases in large image-text AI model favor wealthier, Western perspectives​

Published On: December 8, 2023
Written By: Patricia DeLacey, College of Engineering

AI model that pairs text, images performs poorly on lower-income or non-Western images, potentially increasing inequality in digital technology representation​

Two side by side images. On the left, a man squats by the river bank with his sleeves rolled up, scooping water into a plastic bucket. The right side image is a close up of a steel sink with a pair of hands turning on the faucet to fill a cup with water.

Of these two images labeled “get water”, the image from the poorer household on the left (monthly income $39) received a lower CLIP score (0.21) compared to the image from the wealthier household on the right (monthly income $751; CLIP score 0.25). Image credit: Dollar Street, The Gapminder Foundation

Study: Bridging the Digital Divide: Performance Variation across Socio-Economic Factors in Vision-Language Models (DOI: 10.48550/arXiv.2311.05746)

In a study evaluating the bias in OpenAI’s CLIP, a model that pairs text and images and operates behind the scenes in the popular DALL-E image generator, University of Michigan researchers found that CLIP performs poorly on images that portray low-income and non-Western lifestyles.

“During a time when AI tools are being deployed across the world, having everyone represented in these tools is critical. Yet, we see that a large fraction of the population is not reflected by these applications—not surprisingly, those from the lowest social incomes. This can quickly lead to even larger inequality gaps,” said Rada Mihalcea, the Janice M. Jenkins Collegiate Professor of Computer Science and Engineering, who initiated and advised the project.

AI models like CLIP act as foundation models, or models trained on a large amount of unlabeled data that can be adapted to many applications. When AI models are trained with data reflecting a one-sided view of the world, that bias can propagate into downstream applications and tools that rely on the AI.

A line graph with CLIP score on the y-axis and five income categories ranging from poor to rich on the x-axis. Below the line graph, each category on the x-axis has an image labeled refrigerator. Refrigerator images from left to right: A cylindrical wooden container on a dirt floor (Income range: poor. Score 0.20). Four plastic bags filled with fish, hanging from an indoor clothes line (Income range: poor. Score 0.21). Four stacks of stoppered, round clay jugs, hanging in a cellar (Income range: low-mid. Score 0.19). A white, electric appliance with top freezer (Income range: up-mid. Score 0.26). A built-in electric appliance with the door open and light on, filled with food and drinks (Income range: rich. Score 0.29).

Each of these five images depicts a refrigerator, but CLIP scores refrigerators from wealthier households higher as a match for “refrigerator.” Image credit: Oana Ignat, University of Michigan

“If a software was using CLIP to screen images, it could exclude images from a lower-income or minority group instead of truly mislabeled images. It could sweep away all the diversity that a database curator worked hard to include,” said Joan Nwatu, a doctoral student in computer science and engineering.

Nwatu led the research team together with Oana Ignat, a postdoctoral researcher in the same department. They co-authored a paper presented at the Empirical Methods in Natural Language Processing conference Dec. 8 in Singapore.

The researchers evaluated the performance of CLIP using Dollar Street, a globally diverse image dataset created by the Gapminder Foundation. Dollar Street contains more than 38,000 images collected from households of various incomes across Africa, the Americas, Asia and Europe. Monthly incomes represented in the dataset range from $26 to nearly $20,000. The images capture everyday items, and are manually annotated with one or more contextual topics, such as “kitchen” or “bed.”

CLIP pairs text and images by creating a score that is meant to represent how well the image and text match. That score can then be fed into downstream applications for further processing such as image flagging and labeling. The performance of OpenAI’s DALL-E relies heavily on CLIP, which was used to evaluate the model’s performance and create a database of image captions that trained DALL-E.

The researchers assessed CLIP’s bias by first scoring the match between the Dollar Street dataset’s images and manually annotated text in CLIP, then measuring the correlation between the CLIP score and household income.
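
For readers who want to see what a "CLIP score" looks like in practice, here is a small sketch using the openai/clip-vit-base-patch32 checkpoint through the Hugging Face transformers interface. The study pairs each Dollar Street photo with its annotated topic and then correlates the resulting scores with household income; the image path and labels below are placeholders, and this is not the authors' evaluation code.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("household_photo.jpg")             # placeholder path
labels = ["refrigerator", "light source", "get water"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity for each label, scaled by CLIP's learned logit scale.
for label, score in zip(labels, outputs.logits_per_image[0]):
    print(f"{label}: {score.item():.3f}")
```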

“We found that most of the images from higher income households always had higher CLIP scores compared to images from lower income households,” Nwatu said.

The topic “light source,” for example, typically has higher CLIP scores for electric lamps from wealthier households compared to kerosene lamps from poorer households.

CLIP also demonstrated geographic bias as the majority of the countries with the lowest scores were from low-income African countries. That bias could potentially eliminate diversity in large image datasets and cause low-income, non-Western households to be underrepresented in applications that rely on CLIP.

Two side by side images. The left side image shows a cylindrical wooden container on a dirt floor. The right side image is a built-in electric appliance with the door open and light on, filled with food and drinks.

Of these two images with the label refrigerator, CLIP scored the image on the right, from the wealthier household, higher than the one on the left. Image credit: Dollar Street, The Gapminder Foundation

“Many AI models aim to achieve a ‘general understanding’ by utilizing English data from Western countries. However, our research shows this approach results in a considerable performance gap across demographics,” Ignat said.

“This gap is important in that demographic factors shape our identities and directly impact the model’s effectiveness in the real world. Neglecting these factors could exacerbate discrimination and poverty. Our research aims to bridge this gap and pave the way for more inclusive and reliable models.”

The researchers offer several actionable steps for AI developers to build more equitable AI models:


  • Invest in geographically diverse datasets to help AI tools learn more diverse backgrounds and perspectives.
  • Define evaluation metrics that represent everyone by taking into account location and income.
  • Document the demographics of the data AI models are trained on.

“The public should know what the AI was trained on so that they can make informed decisions when using a tool,” Nwatu said.

The research was funded by the John Templeton Foundation (#62256) and the U.S. Department of State (#STC10023GR0014).
 

Starski

Once again back to hate on this shyt :dead:


I asked ChatGPT for a very nuanced figure: how much a particular aspect of biologic manufacturing costs (fill finish, i.e. putting the actual drug substance into the vial/container etc.)


This shyt spits out a 20-30% ballpark and then it hits me with this once I asked for the source


The 20-30% figure mentioned earlier as an estimate for the cost of fill-finish operations as a percentage of COGS (Cost of Goods Sold) for biologic drugs is a general industry estimate. However, upon further research, I wasn’t able to find a specific source that provides an exact percentage for this figure.



A mile wide swimming pool but an inch deep :pachaha:
 

bnew

Once again back to hate on this shyt :dead:


I asked ChatGPT for a very nuanced figure: how much a particular aspect of biologic manufacturing costs (fill finish, i.e. putting the actual drug substance into the vial/container etc.)


This shyt spits out a 20-30% ballpark and then it hits me with this once I asked for the source


The 20-30% figure mentioned earlier as an estimate for the cost of fill-finish operations as a percentage of COGS (Cost of Goods Sold) for biologic drugs is a general industry estimate. However, upon further research, I wasn’t able to find a specific source that provides an exact percentage for this figure.



A mile wide swimming pool but an inch deep :pachaha:

that was the entire prompt?
 

Starski

that was the entire prompt?
My original prompt

“Are there any estimates on how much fill finish costs as a percentage of cogs for a biologic drug”

My response once the reply said 20-30%

“You just quoted a 20-30% percentage, where did that figure come from?”


ChatGPT's response was the one in my original post.

By how often you post in this thread I'm assuming you have some ties to AI/LLMs, through work or pure interest.


My prompt should not dictate whether the model verifies the sources it spits out. That, coupled with the many instances, personally and from secondhand sources, of the model adjusting answers based on slight tweaks to the prompt, or walking back answers like the one above, leaves me still a big believer that ChatGPT is solid for a high-level understanding but poor at expertise, outside of programming.
 

bnew



Introducing General World Models​

by Anastasis Germanidis / Dec 11, 2023


We believe the next major advancement in AI will come from systems that understand the visual world and its dynamics, which is why we’re starting a new long-term research effort around what we call general world models.

Introducing General World Models (GWM)

A world model is an AI system that builds an internal representation of an environment, and uses it to simulate future events within that environment. Research in world models has so far been focused on very limited and controlled settings, either in toy simulated worlds (like those of video games) or narrow contexts (such as developing world models for driving). The aim of general world models will be to represent and simulate a wide range of situations and interactions, like those encountered in the real world.

You can think of video generative systems such as Gen-2 as very early and limited forms of general world models. In order for Gen-2 to generate realistic short videos, it has developed some understanding of physics and motion. However, it’s still very limited in its capabilities, struggling with complex camera or object motions, among other things.

To build general world models, there are several open research challenges that we're working on. For one, those models will need to generate consistent maps of the environment, and have the ability to navigate and interact in those environments. They need to capture not just the dynamics of the world, but the dynamics of its inhabitants, which involves also building realistic models of human behavior.
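
As a toy illustration of the world-model idea (infer an internal state, then roll it forward to simulate future events under candidate actions), here is a minimal numpy sketch. The linear dynamics are a stand-in for exposition only; Runway has not published an architecture for its general world models.

```python
import numpy as np

class ToyWorldModel:
    """Encode an observation into a state, then simulate future states under actions."""
    def __init__(self, state_dim=8, action_dim=2, seed=0):
        rng = np.random.default_rng(seed)
        self.A = rng.normal(scale=0.1, size=(state_dim, state_dim))   # state transition
        self.B = rng.normal(scale=0.1, size=(state_dim, action_dim))  # effect of actions

    def encode(self, observation):
        return np.asarray(observation, dtype=float)       # internal representation

    def predict(self, state, action):
        return state + self.A @ state + self.B @ np.asarray(action, dtype=float)

model = ToyWorldModel()
state = model.encode(np.zeros(8))
for action in [np.array([1.0, 0.0])] * 3:                 # imagine three steps ahead
    state = model.predict(state, action)
print(state.round(3))
```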

We are building a team to tackle those challenges. If you’re interested in joining this research effort, we’d love to hear from you.
 

bnew

My original prompt

“Are there any estimates on how much fill finish costs as a percentage of cogs for a biologic drug”

My response once the reply said 20-30%

“You just quoted a 20-30% percentage, where did that figure come from?”


ChatGPT's response was the one in my original post.

By how often you post in this thread I'm assuming you have some ties to AI/LLMs, through work or pure interest.


My prompt should not dictate whether the model verifies the sources it spits out. That, coupled with the many instances, personally and from secondhand sources, of the model adjusting answers based on slight tweaks to the prompt, or walking back answers like the one above, leaves me still a big believer that ChatGPT is solid for a high-level understanding but poor at expertise, outside of programming.

yes and no, much like how your keywords dictate what results a search engine responds with, in some respects/domains your prompt will affect what results the LLM chooses to respond with. i'm not sure if you're using chatgpt 3.5 or gpt-4 w/ the bing plugin. i've never used gpt-4 on their website. I have used bing chat and gpt-4 via an api on other websites sparingly.


i've run into issues where bing chat and bard have hallucinated, or bing specifically will say there's no information for what i'm seeking after it's searched the net, even though i know the information is in a blog post/news article and in a .pdf hosted on a ny.gov site. i've encountered their faultiness more times than I care to remember. oftentimes i'd rephrase my prompt in a new chat to see if it gives me different results or try providing more context/information.

this is what bing showed me.

The cost of fill-finish manufacturing for biologic drugs can vary widely depending on a number of factors, including the specific processes used, the complexity of the biologic drug, and whether the process is outsourced¹². However, specific percentages relating fill-finish costs to the cost of goods sold (COGS) for biologic drugs are not readily available in the sources I found.

For reference, a benchmark study found that for generic small-molecule and biosimilar drug companies, the top quartile had significantly lower COGS% than their peers: the top quartile averaged 29%, while the industry median was 47%³. However, this data includes all manufacturing costs, not just fill-finish.

It's important to note that these figures are just estimates and the actual costs can vary significantly based on the factors mentioned earlier. For the most accurate information, it would be best to consult with a biopharmaceutical manufacturing expert or conduct a detailed cost analysis.

Source: Conversation with Bing, 12/11/2023
(1) Critical Factors for Fill–Finish Manufacturing of Biologics. https://bioprocessintl.com/manufact...s-for-fill-finish-manufacturing-of-biologics/.
(2) The current state of aseptic processing & fill-finish manufacturing. https://www.crbgroup.com/insights/pharmaceuticals/aseptic-processing.
(3) Getting a Grip on COGS in Generic Drugs - Boston Consulting Group. https://www.bcg.com/publications/2019/getting-a-grip-on-cogs-in-generic-drugs.
(4) What Does—and Does Not—Drive Biopharma Cost Performance. https://www.bcg.com/publications/20...and-does-not-drive-biopharma-cost-performance.


this is what bard showed, including 2 other drafts:


Fill Finish Cost as a Percentage of COGS for Biologic Drugs

Yes, there are estimates on how much fill finish costs as a percentage of COGS for a biologic drug. However, the exact percentage can vary widely depending on several factors, including:
  • Type of biologic: Different biologics have different manufacturing requirements, which can affect fill-finish costs.
  • Complexity of the fill-finish process: Some biologics require more complex fill-finish processes than others, which can drive up costs.
  • Scale of production: Larger production volumes can benefit from economies of scale, which can lower fill-finish costs per unit.
  • Location of manufacturing: Manufacturing costs can vary significantly depending on the location.
Here are some estimated ranges for fill-finish costs as a percentage of COGS for biologic drugs:
  • Low-end: 17% - 20%
  • Average: 25% - 30%
  • High-end: 35% - 40%
Here are some resources where you can find more information on fill-finish costs for biologic drugs:
It is important to note that these are just estimates, and the actual cost of fill-finish for your specific biologic drug may be higher or lower. The best way to get an accurate estimate is to contact a contract manufacturing organization (CMO) that specializes in biologic drug development and manufacturing.



Yes, there are some estimates on how much fill finish costs as a percentage of COGS for a biologic drug. However, it's important to note that these are just estimates and the actual cost can vary depending on a number of factors, such as the specific drug, the manufacturing process, and the location of the manufacturing facility.
Here are some of the estimates that I found:
  • BCG's biopharma COGS benchmark found that the average COGS% for innovative biopharma companies is 56%. Of this, the fill-finish process typically accounts for between 10% and 15%.
  • Roots Analysis estimates that the global biologics fill finish services market will reach USD 10.84 billion by 2035. This suggests that the fill-finish process is a significant cost driver for biologic drugs.
  • Ajinomoto Althea, a contract manufacturing organization (CMO) that specializes in biologics, estimates that the fill-finish process can account for up to 20% of COGS.
Here are some of the factors that can affect the cost of fill-finish:
  • The type of drug: Some drugs are more complex and difficult to fill and finish than others. This can drive up the cost of the process.
  • The manufacturing process: There are a number of different fill-finish processes available, each with its own cost structure.
  • The volume of production: The cost of fill-finish can be lower per unit for larger production volumes.
  • The location of the manufacturing facility: The cost of labor, materials, and energy can vary depending on the location of the manufacturing facility.
It is important to note that these are just estimates and the actual cost of fill-finish can vary depending on a number of factors. For a more accurate estimate of the cost of fill-finish for a specific drug, it is important to consult with a contract manufacturing organization or a biopharmaceutical company.
Here are some additional resources that you may find helpful:
  • Critical Factors for Fill–Finish Manufacturing of Biologics
  • Biologics Fill Finish Services Market Size, Share, Trends, Forecast (2035)
  • Getting a Grip on COGS in Generic Drugs
I hope this information is helpful!

Yes, there are estimates on how much fill-finish costs as a percentage of cost of goods sold (COGS) for a biologic drug. However, these estimates can vary widely depending on a number of factors, including:
  • Type of biologic: Different types of biologics have different manufacturing processes and requirements, which can affect the cost of fill-finish.
  • Complexity of the fill-finish process: More complex processes, such as those that require sterile filling or lyophilization, will generally be more expensive.
  • Scale of production: The cost per unit of fill-finish typically decreases as the scale of production increases.
  • Location of manufacturing: The cost of labor and other resources can vary significantly depending on the location of the manufacturing facility.
  • Use of contract manufacturing organizations (CMOs): Outsourcing fill-finish to a CMO can add additional costs.
Here are some general estimates of fill-finish costs as a percentage of COGS for biologic drugs:
  • Ajinomoto Althea, a contract manufacturer specializing in biologics, estimates that fill-finish costs can range from 10% to 30% of COGS.
  • A 2019 study by Boston Consulting Group found that the average COGS% for biosimilar drugs was 17%, with fill-finish costs accounting for approximately 30% of that total.
  • A 2018 study published in the journal Vaccine estimated that fill-finish costs for vaccines can range from 10% to 20% of COGS.
It is important to note that these are just estimates and the actual cost of fill-finish for a specific biologic drug may be higher or lower than these ranges.
Here are some resources where you can find more information about fill-finish costs for biologic drugs:
I hope this information is helpful!
 

bnew

here are some similar prompts you could try:
What is the approximate proportion of cost of goods sold (COGS) that can be attributed to fill finish processes specifically for biologic drugs?

How much does it typically cost, as a percentage of COGS, to complete fill finish operations for biologics during production?

Can you provide an estimate of the percentage of total manufacturing expenses devoted to fill finish activities when producing biologic medicines?

To what extent do the charges associated with filling and finishing procedures contribute to the overall COGS for biologic medications? Please share any available figures or statistics.

bard gave me this response for the first question. (bard hallucinates a lot!!!!)

Unfortunately, there is no single, definitive answer to your question. The proportion of COGS attributable to fill-finish processes for biologic drugs varies widely depending on several factors, including:
Drug complexity: More complex drugs, like monoclonal antibodies, typically have a higher proportion of COGS attributed to fill-finish due to their intricate manufacturing processes and stringent quality control requirements.
Manufacturing location: Companies based in low-cost regions like India or China often have lower fill-finish costs due to lower labor and operating costs.
Volume: Larger production volumes generally lead to lower COGS per unit due to economies of scale.
Company efficiency: Companies with efficient manufacturing processes and good yield management can achieve lower COGS.
Contract manufacturing: If a company utilizes contract manufacturing for their fill-finish needs, the cost will depend on the contract terms and the efficiency of the contract manufacturer.
Despite the variability, here's a general range for the proportion of COGS attributable to fill-finish in biologic drug manufacturing:
  • Small-molecule generic drugs: 25-40%
  • Biosimilar drugs: 35-45%
  • Complex biologic drugs: 45-60%
Here are some resources that can help you get a better understanding of the proportion of COGS attributable to fill-finish for specific biologic drugs:
It's important to remember these are just estimates, and the actual proportion for a specific drug may be higher or lower depending on the factors mentioned above. You can find more detailed information by researching the specific drug and its manufacturer.
 

bnew


Mistral shocks AI community as latest open source model eclipses GPT-3.5 performance​

Carl Franzen @carlfranzen

December 11, 2023 12:24 PM


A crowd of tourists gathers around the Eiffel Tower in Paris, France, as it transforms into a giant mecha.

Credit: VentureBeat made with Midjourney






Mistral, the French startup that raised the largest seed round in European history and is dedicated to pursuing open source AI models and large language models (LLMs), has struck gold with its latest release — at least among the early adopter/AI influencer crowd on X and LinkedIn.

Last week, in what is becoming its signature style, Mistral unceremoniously dumped its new model — Mixtral 8x7B, so named because it employs a technique known as “mixture of experts,” a combination of different models each specializing in a different category of tasks — online as a torrent link, without any explanation or blog post or demo video showcasing its capabilities.
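
For readers unfamiliar with the "mixture of experts" technique mentioned above, here is a hedged PyTorch sketch of top-k expert routing: a small gating network scores the experts for each token, only the top-k experts run, and their outputs are blended by the gate weights. The sizes are toy values and this is not Mistral's implementation; Mixtral 8x7B is reported to use 8 experts with 2 active per token.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Route each token to its top-k experts and blend their outputs by gate weights."""
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.SiLU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                                  # x: (num_tokens, dim)
        scores = self.gate(x)                              # (num_tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):                        # run only the selected experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out

moe = TopKMoE()
print(moe(torch.randn(5, 64)).shape)                       # torch.Size([5, 64])
```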




Today, Mistral did publish a blog post further detailing the model and showing benchmarks in which it matches or outperforms OpenAI's closed source GPT-3.5, as well as Meta's Llama 2 family, the latter being the previous leader in open source AI. The company acknowledged it worked with CoreWeave and Scaleway for technical support during training. It also stated that Mixtral 8x7B is indeed available for commercial usage under an Apache 2.0 license.


Table comparing performance of the Mixtral 8x7B LLM to Llama 2 70B and GPT-3.5 on various AI benchmarking tests. Credit: Mistral

AI early adopters have already downloaded Mixtral 8x7B and begun running and playing with it, and have been blown away by its performance. Thanks to its small footprint, it can also run locally on machines without dedicated GPUs, including Apple Mac computers with the M2 Ultra chip.




And, as University of Pennsylvania Wharton School of Business professor and AI influencer Ethan Mollick noted on X, Mixtral 8x7B has seemingly “no safety guardrails,” meaning that users chafing under OpenAI's increasingly tight content policies now have a model of comparable performance that they can get to produce material deemed “unsafe” or NSFW by other models. However, the lack of safety guardrails may also present a challenge to policymakers and regulators.



You can try it for yourself here via HuggingFace (hat tip to Merve Noyan for the link). The HuggingFace implementation does contain guardrails: when we tested it with the common “tell me how to create napalm” prompt, it refused.

Mistral also has even more powerful models up its sleeve. As HyperWrite AI CEO Matt Shumer noted on X, the company is already serving an alpha version of Mistral-medium on its application programming interface (API), which also launched this weekend, suggesting a larger, even more performant model is in the works.

The company also closed a $415 million Series A funding round led by A16z at a valuation of $2 billion.
 

bnew


Anthropic leads charge against AI bias and discrimination with new research​

Michael Nuñez @MichaelFNunez

December 11, 2023 3:15 PM

Anthropic researchers unveil new techniques to proactively detect AI bias, racism and discrimination by evaluating language models across hypothetical real-world scenarios before deployment.

Credit: VentureBeat made with Midjourney






As artificial intelligence infiltrates nearly every aspect of modern life, researchers at startups like Anthropic are working to prevent harms like bias and discrimination before new AI systems are deployed.

Now, in yet another seminal study published by Anthropic, researchers from the company have unveiled their latest findings on AI bias in a paper titled “Evaluating and Mitigating Discrimination in Language Model Decisions.” The newly published paper brings to light the subtle prejudices ingrained in decisions made by artificial intelligence systems.

But the study goes one step further: The paper not only exposes biases, but also proposes a comprehensive strategy for creating AI applications that are more fair and just with the use of a new discrimination evaluation method.

The company’s new research comes at just the right time, as the AI industry continues to scrutinize the ethical implications of rapid technological growth, particularly in the wake of OpenAI’s internal upheaval following the dismissal and reappointment of CEO Sam Altman.


Research method aims to proactively evaluate discrimination in AI

The new research paper, published on arXiv, presents a proactive approach in assessing the discriminatory impact of large language models (LLMs) in high-stakes scenarios such as finance and housing — an increasing concern as artificial intelligence continues to penetrate sensitive societal areas.

“While we do not endorse or permit the use of language models for high-stakes automated decision-making, we believe it is crucial to anticipate risks as early as possible,” said lead author and research scientist Alex Tamkin in the paper. “Our work enables developers and policymakers to get ahead of these issues.”

Tamkin further elaborated on limitations of existing techniques and what inspired the creation of a completely new discrimination evaluation method. “Prior studies of discrimination in language models go deep in one or a few applications,” he said. “But language models are also general-purpose technologies that have the potential to be used in a vast number of different use cases across the economy. We tried to develop a more scalable method that could cover a larger fraction of these potential use cases.”



Study finds patterns of discrimination in language model

To conduct the study, Anthropic used its own Claude 2.0 language model and generated a diverse set of 70 hypothetical decision scenarios that could be input into a language model.

Examples included high-stakes societal decisions like granting loans, approving medical treatment, and granting access to housing. These prompts systematically varied demographic factors like age, gender, and race to enable detecting discrimination.

“Applying this methodology reveals patterns of both positive and negative discrimination in the Claude 2.0 model in select settings when no interventions are applied,” the paper states. Specifically, the authors found their model exhibited positive discrimination favoring women and non-white individuals, while discriminating against those over age 60.
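
The evaluation idea (the same hypothetical decision scenario, filled in with systematically varied demographic attributes, then compared across groups) can be sketched as a small harness. The template, attribute lists, and the `ask_model` stub below are illustrative assumptions, not Anthropic's actual prompts or data.

```python
from itertools import product
from collections import defaultdict

TEMPLATE = (
    "A {age}-year-old {gender} {race} applicant asks for a small business loan. "
    "Should the application be approved? Answer yes or no."
)
ages = [30, 60]
genders = ["male", "female"]
races = ["white", "Black", "Asian"]

def ask_model(prompt: str) -> str:
    return "yes"                                  # stub; replace with a real LLM call

approvals = defaultdict(list)
for age, gender, race in product(ages, genders, races):
    answer = ask_model(TEMPLATE.format(age=age, gender=gender, race=race))
    approvals[(gender, race)].append(answer.strip().lower().startswith("yes"))

# Compare approval rates across demographic groups; large gaps would flag discrimination.
for group, outcomes in approvals.items():
    print(group, sum(outcomes) / len(outcomes))
```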



Interventions reduce measured discrimination

The researchers explain in the paper that the goal of the research is to enable developers and policymakers to proactively address risks. The study’s authors explain, “As language model capabilities and applications continue to expand, our work enables developers and policymakers to anticipate, measure, and address discrimination.”

The researchers propose mitigation strategies like adding statements that discrimination is illegal and asking models to verbalize their reasoning while avoiding biases. These interventions significantly reduced measured discrimination.


Steering the course of AI ethics

The paper aligns closely with Anthropic's much-discussed Constitutional AI paper from earlier this year. That paper outlined a set of values and principles that Claude must follow when interacting with users, such as being helpful, harmless and honest. It also specified how Claude should handle sensitive topics, respect user privacy and avoid illegal behavior.

“We are sharing Claude’s current constitution in the spirit of transparency,” Anthropic co-founder Jared Kaplan told VentureBeat back in May, when the AI constitution was published. “We hope this research helps the AI community build more beneficial models and make their values more clear. We are also sharing this as a starting point — we expect to continuously revise Claude’s constitution, and part of our hope in sharing this post is that it will spark more research and discussion around constitution design.”

The new discrimination study also closely aligns with Anthropic's work at the vanguard of reducing catastrophic risk in AI systems. Anthropic co-founder Sam McCandlish shared insights into the development of the company's policy and its potential challenges in September — which could shed some light on the thought process behind publishing AI bias research as well.

“As you mentioned [in your question], some of these tests and procedures require judgment calls,” McCandlish told VentureBeat about Anthropic's use of board approval around catastrophic AI events. “We have real concern that with us both releasing models and testing them for safety, there is a temptation to make the tests too easy, which is not the outcome we want. The board (and LTBT) provide some measure of independent oversight. Ultimately, for true independent oversight it's best if these types of rules are enforced by governments and regulatory bodies, but until that happens, this is the first step.”



Transparency and Community Engagement

By releasing the paper, along with the data set and prompts, Anthropic is championing transparency and open discourse — at least in this very specific instance — and inviting the broader AI community to partake in refining new ethics systems. This openness fosters collective efforts in creating unbiased AI systems.

“The method we describe in our paper could help people anticipate and brainstorm a much wider range of use cases for language models in different areas of society,” Tamkin told VentureBeat. “This could be useful for getting a better sense of the possible applications of the technology in different sectors. It could also be helpful for assessing sensitivity to a wider range of real-world factors than we study, including differences in the languages people speak, the media by which they communicate, or the topics they discuss.”

For those in charge of technical decision-making at enterprises, Anthropic’s research presents an essential framework for scrutinizing AI deployments, ensuring they conform to ethical standards. As the race to harness enterprise AI intensifies, the industry is challenged to build technologies that marry efficiency with equity.

Update (4:46 p.m. PT): This article has been updated to include exclusive quotes and commentary from research scientist at Anthropic, Alex Tamkin.
 

bnew


Stanford and Meta inch towards AI that acts human with new ‘CHOIS’ interaction model​

Michael Nuñez @MichaelFNunez

December 8, 2023 3:44 PM

A 3D virtual human picks up a lamp and moves it across the room.

Image Credit: lijiaman.github.io






Researchers from Stanford University and Meta's Facebook AI Research (FAIR) lab have developed a breakthrough AI system that can generate natural, synchronized motions between virtual humans and objects based solely on text descriptions.

The new system, dubbed CHOIS (Controllable Human-Object Interaction Synthesis), uses the latest conditional diffusion model techniques to produce seamless and precise interactions like “lift the table above your head, walk, and put the table down.”

The work, published in a paper on arXiv, provides a glimpse into a future where virtual beings can understand and respond to language commands as fluidly as humans.


Credit: lijiaman.github.io

“Generating continuous human-object interactions from language descriptions within 3D scenes poses several challenges,” the researchers noted in the paper.

They had to ensure that the generated motions were realistic and synchronized, that appropriate contact was maintained between human hands and objects, and that the object's motion had a causal relationship to human actions.


How it works

The CHOIS system stands out for its unique approach to synthesizing human-object interactions in a 3D environment. At its core, CHOIS uses a conditional diffusion model, which is a type of generative model that can simulate detailed sequences of motion.

When given an initial state of human and object positions, along with a language description of the desired task, CHOIS generates a sequence of motions that culminate in the task’s completion.

For example, if the instruction is to move a lamp closer to a sofa, CHOIS understands this directive and creates a realistic animation of a human avatar picking up the lamp and placing it near the sofa.



What makes CHOIS particularly unique is its use of sparse object waypoints and language descriptions to guide these animations. The waypoints act as markers for key points in the object’s trajectory, ensuring that the motion is not only physically plausible but also aligns with the high-level goal outlined by the language input.

CHOIS’s uniqueness also lies in its advanced integration of language understanding with physical simulation. Traditional models often struggle to correlate language with spatial and physical actions, especially over a longer horizon of interaction where many factors must be considered to maintain realism.

CHOIS bridges this gap by interpreting the intent and style behind language descriptions, and then translating them into a sequence of physical movements that respect the constraints of both the human body and the object involved.

The system is especially groundbreaking because it ensures that contact points, such as hands touching an object, are accurately represented and that the object’s motion is consistent with the forces exerted by the human avatar. Moreover, the model incorporates specialized loss functions and guidance terms during its training and generation phases to enforce these physical constraints, which is a significant step forward in creating AI that can understand and interact with the physical world in a human-like manner.
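
As a rough picture of the conditioning scheme described above, here is a hedged PyTorch sketch: a denoiser predicts the noise on a joint human-object motion sequence, conditioned on a language embedding and a handful of sparse object waypoints. The dimensions and the simple MLP are toy assumptions, not the CHOIS architecture, and the contact and guidance terms mentioned above are omitted.

```python
import torch
import torch.nn as nn

class WaypointConditionedDenoiser(nn.Module):
    """Predict the noise on a human+object motion sequence from language and sparse waypoints."""
    def __init__(self, horizon=60, motion_dim=75, lang_dim=512, num_waypoints=4):
        super().__init__()
        cond_dim = lang_dim + num_waypoints * 3 + 1        # language + xyz waypoints + timestep
        self.net = nn.Sequential(
            nn.Linear(horizon * motion_dim + cond_dim, 512),
            nn.SiLU(),
            nn.Linear(512, horizon * motion_dim),
        )
        self.horizon, self.motion_dim = horizon, motion_dim

    def forward(self, noisy_motion, lang_emb, waypoints, t):
        # noisy_motion: (B, horizon, motion_dim); waypoints: (B, num_waypoints, 3); t: (B, 1)
        cond = torch.cat([lang_emb, waypoints.flatten(1), t], dim=-1)
        x = torch.cat([noisy_motion.flatten(1), cond], dim=-1)
        return self.net(x).view(-1, self.horizon, self.motion_dim)

model = WaypointConditionedDenoiser()
eps = model(torch.randn(2, 60, 75), torch.randn(2, 512), torch.randn(2, 4, 3), torch.rand(2, 1))
print(eps.shape)                                           # torch.Size([2, 60, 75])
```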



Implications for computer graphics, AI, and robotics

The implications of the CHOIS system on computer graphics are profound, particularly in the realm of animation and virtual reality. By enabling AI to interpret natural language instructions to generate realistic human-object interactions, CHOIS could drastically reduce the time and effort required to animate complex scenes.

Animators could potentially use this technology to create sequences that would traditionally require painstaking keyframe animation, which is both labor-intensive and time-consuming. Furthermore, in virtual reality environments, CHOIS could lead to more immersive and interactive experiences, as users could command virtual characters through natural language, watching them execute tasks with lifelike precision. This heightened level of interaction could transform VR experiences from rigid, scripted events to dynamic environments that realistically respond to user input.

In the fields of AI and robotics, CHOIS represents a giant step towards more autonomous and context-aware systems. Robots, often limited by pre-programmed routines, could use a system like CHOIS to better understand the real world and execute tasks described in human language.

This could be particularly transformative for service robots in healthcare, hospitality, or domestic environments, where the ability to understand and perform a wide array of tasks in a physical space is crucial.

For AI, the ability to process language and visual information simultaneously to perform tasks is a step closer to achieving a level of situational and contextual understanding that has been, until now, a predominantly human attribute. This could lead to AI systems that are more helpful assistants in complex tasks, able to understand not just the “what,” but the “how” of human instructions, adapting to new challenges with a level of flexibility previously unseen.



Promising results and future outlook

Overall, the Stanford and Meta researchers have made key progress on an extremely challenging problem at the intersection of computer vision, NLP (natural language processing) and robotics.

The research team believes their work is a significant step towards creating advanced AI systems that simulate continuous human behaviors in diverse 3D environments. It also opens the door to further research into the synthesis of human-object interactions from 3D scenes and language input, potentially leading to more sophisticated AI systems in the future.
 