> (2014) DaDianNao: A Machine-Learning Supercomputer
VAL
Date: 11.02.2019 21:19
(2014) DaDianNao: A Machine-Learning Supercomputer
Sources:
- https://dl.acm.org/citation.cfm?id=2742217
- http://pages.saclay.inria.fr/olivier.temam...percomputer.pdf - paper
- https://ieeexplore.ieee.org/document/7011421
- Proceedings of MICRO-47, the 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 609-622
- DOI: 10.1109/MICRO.2014.58

Authors: Yunji Chen, Tao Luo, Shaoli Liu, Shijin Zhang, Liqiang He, Jia Wang, Ling Li, Tianshi Chen, Zhiwei Xu, Ninghui Sun, Olivier Temam

QUOTE
Abstract:

Many companies are deploying services, either for consumers or industry, which are largely based on machine-learning algorithms for sophisticated processing of large amounts of data. The state-of-the-art and most popular such machine-learning algorithms are Convolutional and Deep Neural Networks (CNNs and DNNs), which are known to be both computationally and memory intensive. A number of neural network accelerators have been recently proposed which can offer high computational capacity/area ratio, but which remain hampered by memory accesses. However, unlike the memory wall faced by processors on general-purpose workloads, the memory footprint of CNNs and DNNs, while large, is not beyond the capability of the on-chip storage of a multi-chip system. This property, combined with the CNN/DNN algorithmic characteristics, can lead to high internal bandwidth and low external communications, which can in turn enable high-degree parallelism at a reasonable area cost. In this article, we introduce a custom multi-chip machine-learning architecture along those lines. We show that, on a subset of the largest known neural network layers, it is possible to achieve a speedup of 450.65x over a GPU, and reduce the energy by 150.31x on average for a 64-chip system. We implement the node down to the place-and-route at 28 nm, containing a combination of custom storage and computational units, with industry-grade interconnects.
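
The capacity argument in the abstract (a DNN's weight footprint, though large, can be partitioned across the aggregate on-chip storage of a multi-chip system) can be checked with a back-of-the-envelope calculation. Below is a minimal Python sketch, assuming roughly 36 MB of eDRAM per node and 16-bit weights, in line with the paper; the layer sizes are hypothetical and only serve to illustrate the arithmetic.

CODE
# Illustrative sketch: does a DNN's weight footprint fit in the aggregate
# on-chip eDRAM of an N-chip system? The 36 MB-per-node capacity and 16-bit
# weights roughly follow the paper; the layer sizes below are hypothetical.

def weights_bytes(fc_layers, bytes_per_weight=2):
    """Weight storage for fully connected layers given as (n_inputs, n_outputs)."""
    return sum(n_in * n_out for n_in, n_out in fc_layers) * bytes_per_weight

def fits_on_chip(total_bytes, num_chips, edram_per_chip_bytes=36 * 2**20):
    """True if the weights fit in the aggregate on-chip eDRAM of num_chips nodes."""
    return total_bytes <= num_chips * edram_per_chip_bytes

# Hypothetical large network: a few wide fully connected layers.
layers = [(4096, 4096), (4096, 4096), (4096, 1000)]
total = weights_bytes(layers)
print(f"weight footprint: {total / 2**20:.1f} MiB")
for chips in (1, 4, 16, 64):
    print(f"{chips:3d} chips: fits on chip = {fits_on_chip(total, chips)}")

With these hypothetical layers the footprint is about 72 MiB: too large for a single node's eDRAM, but easily partitioned across 4 or more nodes. That is the property the architecture exploits to keep weights resident on chip, yielding high internal bandwidth and low external traffic.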


--------------------
www.valinfo.ru
Всегда... Always....
Quod licet jovi, non licet bovi!