Skywork-13B


Developed by Kunlun Wanwei (Kunlun Tech), this open-source large model has 13 billion parameters and was pre-trained on 3.2 trillion tokens of high-quality multilingual data. It demonstrates excellent natural language processing capability in Chinese and other languages, performs especially well in Chinese-language settings, and is applicable across a number of domains.

Location:
China
Language:
zh
Collection time:
2024-06-03

Skywork-13B is an open-source large model from Kunlun Wanwei. A detailed description follows:

Technical features and advantages:

  • Parameter scale: The Skywork-13B series has 13 billion parameters, making it powerful at handling complex natural language tasks.
  • Training data: The model was pre-trained on 3.2 trillion tokens of high-quality multilingual (mainly Chinese and English) and code data, ensuring broad applicability across languages and cultural contexts.
  • Performance: Skywork-13B performs well on several benchmarks (e.g., C-Eval, MMLU), outperforming comparable models such as LLaMA2-13B.
  • Chinese capability: Skywork-13B outperforms all current Chinese open-source models on Chinese language-modeling perplexity evaluations, providing strong support for Chinese natural language processing.
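The perplexity metric behind that comparison is just the exponential of the average per-token negative log-likelihood. A minimal sketch of the calculation (the log-probabilities below are illustrative placeholders, not actual Skywork outputs):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    n = len(token_log_probs)
    avg_nll = -sum(token_log_probs) / n
    return math.exp(avg_nll)

# Illustrative natural-log probabilities for a 4-token sequence; real values
# would come from the language model's predicted distribution at each step.
log_probs = [math.log(0.25), math.log(0.5), math.log(0.125), math.log(0.25)]
print(perplexity(log_probs))  # ≈ 4.0 — lower perplexity means a better model
```

Because perplexity depends on the tokenizer and evaluation corpus, cross-model comparisons like the one above are only meaningful when all models are scored on the same Chinese test data.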

Application Areas:

  • Skywork-13B performs well in many fields such as technology, finance, government, enterprise services, culture and creativity, and gaming, with particularly significant advantages in Chinese-language settings.
  • Skywork-13B-Math specializes in mathematical tasks: after intensive training on mathematical skills, it achieves the best results among models of the same size on datasets such as GSM8K.
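GSM8K results are typically scored by exact match on the final numeric answer in the model's generated solution. A hypothetical sketch of such a scorer (the regex, helper names, and examples are assumptions for illustration, not the official evaluation harness):

```python
import re

def extract_final_number(text):
    """Return the last number appearing in an answer string, or None."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

def gsm8k_accuracy(predictions, references):
    """Fraction of predictions whose final number matches the reference answer."""
    hits = sum(
        1 for pred, ref in zip(predictions, references)
        if extract_final_number(pred) == float(ref)
    )
    return hits / len(references)

preds = ["The total is 3 + 4 = 7 apples.", "So she pays $18 in all."]
refs = ["7", "20"]
print(gsm8k_accuracy(preds, refs))  # → 0.5 (first answer correct, second wrong)
```

Exact-match scoring rewards models that state a clean final number, which is one reason math-specialized fine-tunes like Skywork-13B-Math do well on this benchmark.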

Open Source and Commercial:

  • The Skywork-13B series adopts an open-source strategy, releasing the Skywork-13B-Base model, the Skywork-13B-Math model, and their quantized versions to support deployment and inference on consumer graphics cards.
  • The models allow commercial use without an application process, providing great convenience for developers and businesses.
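Whether a 13-billion-parameter model fits on a consumer graphics card comes down largely to weight precision, which is why the quantized releases matter. A back-of-the-envelope estimate of weight memory alone (activation and KV-cache overhead ignored; the figures are rough):

```python
def weight_memory_gb(n_params, bits_per_weight):
    """Approximate memory needed for model weights alone, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

N = 13e9  # Skywork-13B parameter count
for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: ~{weight_memory_gb(N, bits):.1f} GB")
# fp16: ~26.0 GB, int8: ~13.0 GB, int4: ~6.5 GB
```

At int8 or int4, the weights fit within the 8–24 GB of VRAM typical of consumer GPUs, whereas the full fp16 model generally does not.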

Data sets and resources:

  • A large amount of multilingual and code data was used to train Skywork-13B. At its core is the Skywork-150B dataset, which contains about 150 billion Chinese characters and provides a rich, high-quality corpus for model training.
  • Kunlun Wanwei has also released "Skypile/Chinese-Web-Text-150B", a 600 GB, 150B-token high-quality Chinese corpus, further supporting research and applications in Chinese-language settings.

Challenges and prospects:

  • Although Skywork-13B performs excellently in many respects, open-source models still face challenges, such as ensuring model security and handling the intellectual-property questions that open sourcing may raise.
  • As the technology advances and applications expand, Skywork-13B is expected to be further optimized and improved, contributing more to the development of natural language processing.

As a powerful, widely applicable open-source large model, Skywork-13B holds an important position and offers significant value in the field of natural language processing.
