**2018**

Automatic determination of cognitive models for deployment at computerized devices having various hardware constraints
Scheidegger Florian M., Istrate Roxana, Mariani Giovanni, Bekas Konstantinos, Malossi A. Cristiano I.
US20200193266A1
Abstract
Determining cognitive models to be deployed at auxiliary devices may include maintaining relations, e.g., in a database. The relations map hardware characteristics of auxiliary devices and example datasets to cognitive models. Cognitive models are determined for auxiliary devices, based on said relations, e.g., for each of the auxiliary devices. An input dataset is accessed, which comprises data of interest, e.g., collected at a core computing system (CCS), and hardware characteristics of each of the auxiliary devices. An auxiliary cognitive model is determined based on a core cognitive model run on the input dataset accessed, wherein the core cognitive model has been trained to learn at least part of said relations. Parameters of the auxiliary model determined can be communicated to said each of the auxiliary devices for the latter to deploy the auxiliary model determined. The method may be implemented in a network having an edge computing architecture.

Creating optimized machine-learning models
Thomas Gegi, Malossi A. Cristiano I., Pedapati Tejaswini, Venkataraman Ganesh, Istrate Roxana, Wistuba Martin, Scheidegger Florian M., Xue Chao, Yan Rong, Samulowitz Horst C., Herta Benjamin, Saha Debashish, Strobelt Hendrik
US20200184380A1
Abstract
A machine-learning model generation method, system, and computer program product deciding, via a first algorithm, a machine-learning algorithm that is best for customer data, invoking the machine-learning algorithm to train a neural network model with the customer data, analyzing the neural network model produced by the training for an accuracy, and improving the accuracy by iteratively repeating the training of the neural network model until a customer-defined constraint is met, as determined by the first algorithm.

**2017**

Higher accuracy of non-volatile memory-based vector multiplication
Bekas Konstantinos, Curioni Alessandro, Eleftheriou Evangelos Stavros, Le Gallo-Bourdeau Manuel, Malossi A. Cristiano I., Sebastian Abu
US10614150B2
Abstract
A multiplication device for performing a matrix-vector-multiplication may be provided. The multiplication device comprises a memristive crossbar array comprising a plurality of memristive devices. The device comprises a decomposition unit adapted for decomposing a matrix into a partial sum of multiple sub-matrices, and decomposing a vector into a sum of multiple sub-vectors, a programming unit adapted for programming the plurality of the memristive devices with values representing elements of the sub-matrices such that each one of the memristive devices corresponds to one of the elements of the sub-matrices, an applying unit adapted for applying elements of one of the multiple sub-vectors as input values to the memristive crossbar array to input lines of the memristive crossbar array resulting in partial results at output lines of the memristive crossbar array, and a summing unit adapted for scaling and summing the partial results building the product of the matrix and the vector.
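
The decompose, apply, scale, and sum flow described in this abstract can be illustrated with a small pure-Python sketch. The abstract does not say how the matrix and vector are decomposed, so the base-4 digit slicing of non-negative integers used here is an assumption, and `crossbar_matvec` is a hypothetical stand-in for one analog read of a memristive crossbar, not a model of the hardware.

```python
def digit_slices(values, base, num_slices):
    """Split non-negative integers into digits: v == sum(base**k * slice_k[v])."""
    return [[(v // base**k) % base for v in values] for k in range(num_slices)]


def crossbar_matvec(sub_matrix, sub_vector):
    """Hypothetical stand-in for one crossbar read: a low-precision product."""
    return [sum(a * x for a, x in zip(row, sub_vector)) for row in sub_matrix]


def decomposed_matvec(matrix, vector, base=4, num_slices=4):
    """Compute matrix @ vector from scaled partial products of digit slices."""
    # Decompose the matrix into sub-matrices whose entries lie in [0, base).
    sub_matrices = [
        [[(a // base**k) % base for a in row] for row in matrix]
        for k in range(num_slices)
    ]
    # Decompose the vector into sub-vectors in the same way.
    sub_vectors = digit_slices(vector, base, num_slices)

    result = [0] * len(matrix)
    for i, sub_m in enumerate(sub_matrices):
        for j, sub_v in enumerate(sub_vectors):
            partial = crossbar_matvec(sub_m, sub_v)  # partial result at the output lines
            scale = base ** (i + j)                  # weight of this pair of slices
            result = [r + scale * p for r, p in zip(result, partial)]
    return result


matrix = [[200, 13], [7, 255]]
vector = [100, 45]
product = decomposed_matvec(matrix, vector)
```

With four base-4 slices, every entry up to 255 is represented exactly, so `product` matches the direct matrix-vector product. In a real device each partial product would carry analog noise, which is the motivation for working with low-precision slices.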

**2016**

Logarithm and Power (Exponentiation) Computations Using Modern Computer Architectures
Bekas Konstantinos, Curioni Alessandro, Ineichen Yves G, Malossi A. Cristiano I.
US Patent App. 15/138,846
Abstract
Embodiments of the present invention may provide the capability to evaluate logarithm and power (exponentiation) functions using either hardware specific instructions, or a hardware specific implementation with reduced memory requirements. An input comprising a floating point representation of a real number may be received and a mantissa and an exponent may be extracted. A function of a logarithm of a mantissa of the real number may be approximated by utilizing a polynomial based on the mantissa. The approximated function of the logarithm may be combined with the exponent for calculating a value comprising a logarithm of the real number. Likewise, an input comprising a floating point representation of a real number and a representation of a second number may be received and an approximation of the real number to the power of the second number may be generated.
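
As a rough illustration of the mantissa/exponent approach, the sketch below splits a float with `math.frexp`, approximates the logarithm of the mantissa with a short polynomial, and adds the exponent contribution. The choice of polynomial is an assumption: it uses the odd series for 2·atanh(t) with t = (m − 1)/(m + 1), not the coefficients of the patent application. The names `approx_log` and `approx_pow` are hypothetical.

```python
import math

LN2 = math.log(2.0)


def approx_log(x, terms=8):
    """Approximate ln(x) for x > 0 from the mantissa and exponent of x."""
    if x <= 0.0:
        raise ValueError("approx_log requires x > 0")
    # x == mantissa * 2**exponent, with mantissa in [0.5, 1).
    mantissa, exponent = math.frexp(x)
    # ln(m) = 2 * atanh(t) = 2 * (t + t^3/3 + t^5/5 + ...), where |t| <= 1/3.
    t = (mantissa - 1.0) / (mantissa + 1.0)
    t_squared = t * t
    series, power = 0.0, t
    for k in range(terms):
        series += power / (2 * k + 1)
        power *= t_squared
    # Combine the mantissa logarithm with the exponent: ln(x) = ln(m) + e * ln(2).
    return 2.0 * series + exponent * LN2


def approx_pow(x, y):
    """Approximate x**y for x > 0 as exp(y * ln(x)), using approx_log."""
    return math.exp(y * approx_log(x))
```

Because the mantissa is confined to [0.5, 1), the series argument never exceeds 1/3 in magnitude, so a handful of terms is enough for the polynomial part regardless of how large or small `x` is.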

**2015**

Fast, energy-efficient exponential computations in SIMD architectures
Bekas Konstantinos, Curioni Alessandro, Ineichen Yves, Malossi A. Cristiano I.
US Patent App. 14/745,499
Abstract
In one embodiment, a computer-implemented method includes receiving as input a value of a variable x and receiving as input a degree n of a polynomial function being used to evaluate an exponential function e^x. A first expression A*(x-ln(2)*K_n(x_f))+B is evaluated, by one or more computer processors in a single instruction multiple data (SIMD) architecture, as an integer and is read as a double. In the first expression, K_n(x_f) is a polynomial function of the degree n, x_f is a fractional part of x/ln(2), A=2^52/ln(2), and B=1023*2^52. The result of reading the first expression as a double is returned as the value of the exponential function with respect to the variable x.
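
The integer-then-reinterpret trick can be sketched in scalar Python, without SIMD. The expression and the constants A and B follow the abstract. The coefficients of K_n are not given there, so this sketch assumes K_n(x_f) approximates 1 + x_f − 2^x_f and builds it from a degree-n Taylor expansion of 2^x_f; the function names are hypothetical.

```python
import math
import struct

LN2 = math.log(2.0)
A = 2.0**52 / LN2   # scales x/ln(2) into the exponent field of a double
B = 1023 * 2**52    # IEEE 754 double exponent bias, shifted past the 52 mantissa bits


def k_poly(x_f, n):
    """Assumed K_n: degree-n Taylor approximation of 1 + x_f - 2**x_f."""
    z = x_f * LN2
    term, series = 1.0, 0.0
    for k in range(1, n + 1):
        term *= z / k
        series += term      # accumulates an approximation of 2**x_f - 1
    return x_f - series


def fast_exp(x, n=8):
    """Approximate e**x for roughly |x| < 700 by building the bits of a double."""
    y = x / LN2
    x_f = y - math.floor(y)                        # fractional part of x / ln(2)
    as_int = int(A * (x - LN2 * k_poly(x_f, n)) + B)
    # Reinterpret the 64-bit integer pattern as an IEEE 754 double.
    return struct.unpack("<d", struct.pack("<q", as_int))[0]
```

The reason this works: x/ln(2) − K(x_f) equals the integer part of x/ln(2) plus 2^x_f − 1, so after scaling by 2^52 the integer part lands in the exponent bits and 2^x_f − 1 lands in the mantissa bits. Reading the pattern back as a double yields 2^(integer part) · 2^x_f = e^x.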