Neural Networks Beginnings - Carter Jade


import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Defining the model. The opening lines of this listing were lost at the
# page break; the layers below are a reconstruction based on the
# description that follows: an LSTM over sequences of 13-dimensional
# MFCC vectors, fully connected relu layers, and a softmax output
# over the 10 classes.
model = keras.Sequential(
    [
        layers.LSTM(128, return_sequences=True, input_shape=(None, 13)),
        layers.Dense(128, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ]
)

# Compilation of the model
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=0.001),
    loss=keras.losses.CategoricalCrossentropy(),
    metrics=["accuracy"],
)

# Loading the audio file (a mono WAV is assumed)
audio_file = tf.io.read_file("audio.wav")
audio, _ = tf.audio.decode_wav(audio_file)
audio = tf.squeeze(audio, axis=-1)
audio = tf.cast(audio, tf.float32)

# Splitting into segments and extracting MFCC features.
# tf.signal.stft itself slices the signal into windows of frame_length
# samples taken every frame_step samples and zero-pads the tail
# (pad_end=True), so the manual padding and reshaping of the original
# listing is folded into this one call.
frame_length = 640
frame_step = 320
stfts = tf.signal.stft(audio, frame_length=frame_length,
                       frame_step=frame_step, pad_end=True)
spectrograms = tf.abs(stfts)
num_spectrogram_bins = stfts.shape[-1]
# map the linear-frequency spectrogram onto 13 mel bins
# (a 16 kHz sampling rate is assumed for "audio.wav")
linear_to_mel = tf.signal.linear_to_mel_weight_matrix(
    num_mel_bins=13,
    num_spectrogram_bins=num_spectrogram_bins,
    sample_rate=16000,
)
mel_spectrograms = tf.tensordot(spectrograms, linear_to_mel, 1)
log_mel_spectrograms = tf.math.log(mel_spectrograms + 1e-6)
mfccs = tf.signal.mfccs_from_log_mel_spectrograms(log_mel_spectrograms)

# Data preparation for training
labels = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine", "zero"]
label_to_index = dict(zip(labels, range(len(labels))))
index_to_label = dict(zip(range(len(labels)), labels))
text = "one two three four five six seven eight nine zero"
word_ids = [label_to_index[word] for word in text.split()]
# Crude uniform alignment for this toy example: assume the ten spoken
# words occupy equal spans of the recording, so every frame inherits the
# label of the word whose span it falls into (real systems use CTC or
# forced alignment instead).
num_frames = tf.shape(mfccs)[0]
frame_to_word = tf.range(num_frames) * len(word_ids) // num_frames
frame_labels = tf.gather(word_ids, frame_to_word)
X_train = mfccs[None, ...]
y_train = tf.one_hot(frame_labels, len(labels))[None, ...]

# Training the model
history = model.fit(X_train, y_train, epochs=10)

# Making predictions
predicted_probs = model.predict(X_train)
predicted_indexes = tf.argmax(predicted_probs, axis=-1)[0]
predicted_labels = [index_to_label[int(i)] for i in predicted_indexes.numpy()]

# Outputting results
print("Predicted labels:", predicted_labels)



This code implements automatic speech recognition with a neural network built on TensorFlow and Keras. The first step is to define the network architecture using the Keras Sequential API. Here a recurrent LSTM layer is used, which takes in a sequence of 13-dimensional MFCC feature vectors. It is followed by fully connected layers with the relu activation function and one output layer with the softmax activation function, which outputs a probability for each speech class.

Next, the model is compiled using the compile method. The optimizer chosen is Adam with a learning rate of 0.001, the loss function is categorical cross-entropy, and classification accuracy is used as the metric.

Then a sound file in WAV format is loaded, decoded using tf.audio.decode_wav, and converted to float32 samples. The signal is split into overlapping fragments of length 640 with a step of 320; if it cannot be divided evenly, the tail is zero-padded.

Next, mel-frequency cepstral coefficient (MFCC) features are extracted from each fragment: the magnitude spectrogram is mapped onto a mel scale, logarithmized, and passed to tf.signal.mfccs_from_log_mel_spectrograms. These features are used to train the model.

To train the model, the data needs to be prepared. Here, a text string lists all possible classes; each word is mapped to its class index, and the indices are one-hot encoded with tf.one_hot. Because this toy example has no word-level time alignment, each audio frame is simply assigned the label of the word whose (assumed equal-length) span it falls into. The prepared data is then passed to the model's fit method.

After training, predictions are made on the same data using the predict method. For each frame, the index with the highest probability and its corresponding class are selected.

Finally, the predicted class labels are printed.
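Because the listing predicts one label per frame, consecutive duplicates can be collapsed to recover an approximate word sequence. A minimal sketch using Python's standard itertools, run after the listing above:

from itertools import groupby

# collapse runs of identical frame-level labels into a single word each
word_sequence = [word for word, _ in groupby(predicted_labels)]
print("Recognized words:", word_sequence)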


Recommender system

For convenience, let's describe the process in five steps:

Step 1: Data collection

The first step in creating a recommender system is data collection. This involves gathering data about users, such as their preferences, purchases, browsing history, and so on. This data can be obtained from various sources, such as databases or user logs.

Step 2: Data preparation

After the data is collected, it must be prepared: preprocessing may be required to clean it of noise and outliers. Various techniques can be used for this, such as standardization and normalization of the data.
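As a small illustration of this step (a hedged sketch; the matrix X is a made-up placeholder, not data from this book), min-max normalization and z-score standardization look like this in NumPy:

import numpy as np

X = np.array([[5.0, 3.0], [4.0, 1.0], [1.0, 5.0]])  # hypothetical raw features

# min-max normalization: rescale each column to the [0, 1] range
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# z-score standardization: zero mean and unit variance per column
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_norm)
print(X_std)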

Step 3: Model training

Once the data is prepared, we can proceed to model training. To create a recommender system, we can use various types of neural networks, such as convolutional neural networks or recurrent neural networks. The model should be trained on the training set of data.
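For a neural variant (a minimal sketch under illustrative assumptions; the layer sizes and toy rating triples below are made up), user and item IDs can be embedded and the dot product of their embeddings trained to match the known ratings:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

num_users, num_items, num_factors = 5, 4, 2

# two ID inputs, one embedding table each
user_in = keras.Input(shape=(1,))
item_in = keras.Input(shape=(1,))
user_vec = layers.Flatten()(layers.Embedding(num_users, num_factors)(user_in))
item_vec = layers.Flatten()(layers.Embedding(num_items, num_factors)(item_in))
# predicted rating = dot product of the two latent vectors
rating_out = layers.Dot(axes=1)([user_vec, item_vec])

model = keras.Model([user_in, item_in], rating_out)
model.compile(optimizer="adam", loss="mse")

# train on observed (user, item, rating) triples
users = np.array([0, 0, 1, 2, 3])
items = np.array([0, 1, 0, 3, 3])
vals = np.array([5.0, 3.0, 4.0, 5.0, 4.0])
model.fit([users, items], vals, epochs=100, verbose=0)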

Step 4: Model testing

After training the model, we need to test it to ensure that it works correctly. To do this, we can use a testing set of data. During testing, we can analyze metrics such as accuracy and recall.
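For instance, with binary "relevant / not relevant" labels, accuracy and recall can be computed directly (a hedged sketch with made-up arrays):

import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1])  # hypothetical ground truth
y_pred = np.array([1, 0, 0, 1, 0, 1])  # hypothetical model output

# accuracy: share of all predictions that are correct
accuracy = np.mean(y_true == y_pred)
# recall: share of truly relevant items the model recovered
recall = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)

print("Accuracy:", accuracy, "Recall:", recall)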

Step 5: Model application

After the model has passed testing, it can be used to recommend content to users. For example, we can use the model to recommend science fiction books to a user who has previously purchased such books. In this case, the model can use data about the user to predict what they might be interested in.



The code for a recommender system depends on the type of user and item data available, as well as on the network architecture employed. Here is example code for a simple matrix factorization-based recommender system that uses user-item ratings data:


import numpy as np

# Load the data: a small user-item rating matrix (0 = no rating)
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4],
])

# Initialize parameters
num_users, num_items = ratings.shape
num_factors = 2
learning_rate = 0.01
num_epochs = 1000

# Initialize the user and item factor matrices
user_matrix = np.random.normal(scale=1./num_factors, size=(num_users, num_factors))
item_matrix = np.random.normal(scale=1./num_factors, size=(num_factors, num_items))

# Matrix factorization training: stochastic gradient descent
# over the observed (non-zero) ratings only
for epoch in range(num_epochs):
    for i in range(num_users):
        for j in range(num_items):
            if ratings[i][j] > 0:
                error = ratings[i][j] - np.dot(user_matrix[i, :], item_matrix[:, j])
                user_matrix[i, :] += learning_rate * (error * item_matrix[:, j])
                item_matrix[:, j] += learning_rate * (error * user_matrix[i, :])

# Predict ratings for all users and items
predicted_ratings = np.dot(user_matrix, item_matrix)

# Recommend items for a specific user, best predicted rating first
user_id = 0
recommended_items = np.argsort(predicted_ratings[user_id])[::-1]
print("Recommendations for user", user_id)
print(recommended_items)


In this example, we used matrix factorization to build a recommender system. We initialized the user and item matrices with random values and trained them on the known ratings. The learned matrices were then used to predict ratings for all user-item pairs, and items were recommended to a specific user based on those predictions. Real systems use more complex algorithms and far more diverse data.
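One practical refinement, continuing the arrays from the listing above (a hedged sketch, not part of the original example): exclude items the user has already rated before recommending.

# recommend only items the user has not rated yet
unrated = np.where(ratings[user_id] == 0)[0]
ranked = unrated[np.argsort(predicted_ratings[user_id][unrated])[::-1]]
print("Unseen-item recommendations for user", user_id)
print(ranked)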


4. Automatic emotion detection.

Process description.

We import the necessary modules from TensorFlow.

We create a model using convolutional neural networks. The model takes input in the form of a 48x48x1 pixel image. Conv2D, BatchNormalization, and MaxPooling2D layers are used to extract features from the image, and the Flatten layer converts those features into a one-dimensional vector. Dense, BatchNormalization, and Dropout layers then classify the emotion into 7 categories (happiness, sadness, anger, etc.). We compile the model, specifying the optimizer, loss function, and metrics; train it on the training dataset with validation on the validation dataset; evaluate its accuracy on the testing dataset; and finally use it to predict emotions on new data.


import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Creating the model
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(48, 48, 1)),
    layers.BatchNormalization(),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Dropout(0.25),

    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Dropout(0.25),

    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D(pool_size=(2, 2)),
    layers.Dropout(0.25),

    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.5),
    layers.Dense(7, activation='softmax')
])

# Compiling the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Training the model
# (train_data/train_labels, val_data/val_labels, test_data/test_labels and
# new_data are assumed to have been loaded beforehand, e.g. from an
# emotion dataset of 48x48 grayscale images)
history = model.fit(train_data, train_labels, epochs=50, validation_data=(val_data, val_labels))

# Evaluation of the model
test_loss, test_acc = model.evaluate(test_data, test_labels)
print('Test accuracy:', test_acc)

# Using the model
predictions = model.predict(new_data)


This code creates a convolutional neural network for recognizing emotions on 48x48 pixel images.

The first layer is a 3x3 convolution with 32 filters and ReLU activation that takes 48x48x1 input images. It is followed by batch normalization, max pooling with a 2x2 window, and dropout to help prevent overfitting.

Two more convolutional blocks with increasing filter counts (64 and 128) and the same normalization, pooling, and dropout layers are then added. A flattening layer follows, which converts the multidimensional feature maps into a one-dimensional vector.

Next comes a fully connected layer of 256 neurons with ReLU activation and batch normalization, followed by dropout. The final layer contains 7 neurons and uses the softmax activation function to output a probability for each of the 7 emotions.

The model is compiled with the Adam optimizer, the categorical_crossentropy loss function, and the accuracy metric, and is trained on the training data for 50 epochs with validation on the validation data.

After training, the model is evaluated on the test data, and the accuracy of predictions is displayed. Then the model is used to predict emotions on new data.
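As a follow-up to the last line (a hedged sketch; the emotion names below are one common 7-class ordering, e.g. FER2013's, and are not specified by this book), the softmax outputs can be mapped to readable labels:

import numpy as np

# hypothetical 7-class emotion set in FER2013 order
emotions = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

predicted_classes = np.argmax(predictions, axis=1)
predicted_emotions = [emotions[i] for i in predicted_classes]
print(predicted_emotions[:10])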

Conclusion on Chapter 1:

In this chapter, we have covered the fundamental concepts underlying neural networks. We learned what a neuron is, how it works in a neural network, what weights and biases are, how a neuron makes decisions, and how a neural network is constructed. We also discussed the process of training a neural network and how it adjusts its weights and biases to improve prediction accuracy.
