
ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required by MinMaxScaler. #7

Open
JackRossProjects opened this issue Apr 15, 2019 · 6 comments

@JackRossProjects

Trying to train the scaler on the training data and smooth the data:

```python
smoothing_window_size = 2500
for di in range(0, 10000, smoothing_window_size):
    scaler.fit(train_data[di:di + smoothing_window_size, :])
    train_data[di:di + smoothing_window_size, :] = scaler.transform(train_data[di:di + smoothing_window_size, :])
```

I get this error:

```
ValueError                                Traceback (most recent call last)
<ipython-input> in <module>
      2 smoothing_window_size = 2500
      3 for di in range(0,10000,smoothing_window_size):
----> 4     scaler.fit(train_data[di:di+smoothing_window_size,:])
      5     train_data[di:di+smoothing_window_size,:] = scaler.transform(train_data[di:di+smoothing_window_size,:])

~\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in fit(self, X, y)
    306         # Reset internal state before fitting
    307         self._reset()
--> 308         return self.partial_fit(X, y)
    309
    310     def partial_fit(self, X, y=None):

~\Anaconda3\lib\site-packages\sklearn\preprocessing\data.py in partial_fit(self, X, y)
    332
    333         X = check_array(X, copy=self.copy, warn_on_dtype=True,
--> 334                         estimator=self, dtype=FLOAT_DTYPES)
    335
    336         data_min = np.min(X, axis=0)

~\Anaconda3\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    460                              " minimum of %d is required%s."
    461                              % (n_samples, shape_repr, ensure_min_samples,
--> 462                                 context))
    463
    464     if ensure_min_features > 0 and array.ndim == 2:

ValueError: Found array with 0 sample(s) (shape=(0, 1)) while a minimum of 1 is required by MinMaxScaler.
```

Apologies if I'm missing something glaringly obvious but I'm at a loss.
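For context, the failure reproduces whenever a slice of `train_data` comes back empty: the loop always runs `di` up to 10000, so if the array has fewer rows, the later windows have shape `(0, 1)` and `MinMaxScaler.fit` rejects them. A minimal sketch, assuming a hypothetical 5000-row array:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
train_data = rng.random((5000, 1))  # hypothetical: only 5000 rows, not 10000

scaler = MinMaxScaler()
smoothing_window_size = 2500

for di in range(0, 10000, smoothing_window_size):
    window = train_data[di:di + smoothing_window_size, :]
    if window.shape[0] == 0:
        # di = 5000 and 7500 slice past the end of the array,
        # so fit(window) would raise exactly the ValueError above.
        print(f"di={di}: empty window {window.shape}, fit would raise ValueError")
    else:
        scaler.fit(window)
        print(f"di={di}: fitted on {window.shape}")
```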

@makamkkumar

Facing the same issue. Have you found a workaround?

@gigerbytes

Make sure you download all the data from 1962 from the Yahoo page (set the filter to Max and click Apply); otherwise the smoothing window goes out of bounds.

The default download is only 20 records.
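Alternatively, the loop can be made robust to any row count by iterating over the actual length of the data instead of a hard-coded 10000. A minimal sketch, assuming `train_data` is a 2-D NumPy array (the 4300-row size here is just illustrative):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
train_data = rng.random((4300, 1))  # hypothetical row count; any length works

scaler = MinMaxScaler()
smoothing_window_size = 2500

# Stop at len(train_data) so the final (possibly shorter) window
# is never empty, which avoids the 0-sample ValueError.
for di in range(0, len(train_data), smoothing_window_size):
    window = train_data[di:di + smoothing_window_size, :]
    scaler.fit(window)
    train_data[di:di + smoothing_window_size, :] = scaler.transform(window)
```

Each window is scaled to [0, 1] independently, which matches what the original loop was trying to do window by window.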

@karan842

karan842 commented Mar 4, 2021

I can't understand this error, and I also want a solution. Someone please explain!

@nmshafie1993

I am getting the same error, any solution?

@BogereMark879

I want to build a Flask API that connects to a Flutter mobile application; below is the code of the Flask API:
```python
import pickle
import re

import nltk
import numpy as np
import pandas as pd
from flask import Flask, request, jsonify
from flask_cors import CORS
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

app = Flask(__name__)
CORS(app)

# Load the model
model = pickle.load(open('similarity1.pkl', 'rb'))

# Load the attractions and preferences data
attractions = pd.read_csv(r"C:\Users\Bogere\OneDrive\Desktop\Tourism\tourism_attractions.csv")
preferences = pd.read_csv(r"C:\Users\Bogere\OneDrive\Desktop\Tourism\userA_preferences.csv")
attractions = attractions[['item_id', 'name', 'experience_tags']]

# Normalize the documents
nltk.download('stopwords')
nltk.download('punkt')  # word_tokenize needs the punkt tokenizer data
stop_words = nltk.corpus.stopwords.words('english')

def normalize_document(document):
    # Flags must be passed via the flags= keyword; as a positional
    # argument they would be interpreted as the count parameter.
    document = re.sub(r'[^a-zA-Z0-9\s]', '', document, flags=re.I | re.A)
    document = document.lower().strip()
    tokens = nltk.word_tokenize(document)
    filtered_tokens = [token for token in tokens if token not in stop_words]
    return ' '.join(filtered_tokens)

norm_corpus_attractions = attractions['experience_tags'].apply(normalize_document)
norm_corpus_preferences = preferences['preferences'].apply(normalize_document)

# Compute the cosine similarity scores
tfidf_vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
tfidf_matrix_attractions = tfidf_vectorizer.fit_transform(norm_corpus_attractions)
tfidf_matrix_preferences = tfidf_vectorizer.transform(norm_corpus_preferences)
cosine_similarity_scores = cosine_similarity(tfidf_matrix_attractions, tfidf_matrix_preferences)
df_cosine_similarity = pd.DataFrame(cosine_similarity_scores)
df_cosine_similarity.index = df_cosine_similarity.index + 1
df_cosine_similarity.index.name = 'item_id'
df_cosine_similarity = df_cosine_similarity.rename(columns={0: 'similarity_score'})

@app.route('/', methods=['GET'])
def index():
    if request.method == 'GET':
        # Get the selected checkboxes from the user input
        selected_preferences = request.args.getlist('title')

        # Normalize the selected preferences
        norm_selected_preferences = [normalize_document(pref) for pref in selected_preferences]

        # Compute the cosine similarity scores
        tfidf_matrix_selected_preferences = tfidf_vectorizer.transform(norm_selected_preferences)
        cosine_similarity_selected = cosine_similarity(tfidf_matrix_attractions, tfidf_matrix_selected_preferences)
        df_cosine_similarity_selected = pd.DataFrame(cosine_similarity_selected)
        df_cosine_similarity_selected.index = df_cosine_similarity_selected.index + 1
        df_cosine_similarity_selected.index.name = 'item_id'
        df_cosine_similarity_selected = df_cosine_similarity_selected.rename(columns={0: 'similarity_score'})

        # Merge the attractions data with the similarity scores
        attractions_with_similarity_scores = pd.merge(attractions, df_cosine_similarity_selected, on='item_id')

        # Sort the recommendations by similarity score in descending order
        recommendations = attractions_with_similarity_scores.sort_values(by='similarity_score', ascending=False)

        # Select the top N recommendations
        N = 5
        top_recommendations = recommendations['name'].tolist()[:N]

        # Return the recommendations as a JSON object
        return jsonify({'recommendations': top_recommendations})

if __name__ == '__main__':
    app.run(debug=True)
```
Unfortunately, the link provided by the Flask function gives me the error below. How can I solve it? Someone help me, please.

```
ValueError: Found array with 0 sample(s) (shape=(0, 47)) while a minimum of 1 is required by TfidfTransformer.
```
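This traceback means `request.args.getlist('title')` returned an empty list (hitting the route with no `?title=...` query parameters does exactly that), so `tfidf_vectorizer.transform` is called on zero documents. A sketch of a guard, with a hypothetical `transform_preferences` helper and toy corpus standing in for the route and CSV data:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy stand-in for the attractions' experience tags
corpus = ["beach sunset", "mountain hiking", "city museum tour"]
vectorizer = TfidfVectorizer()
vectorizer.fit(corpus)

def transform_preferences(selected_preferences):
    """Return a TF-IDF matrix, or None when the request carried no titles."""
    if not selected_preferences:
        # An empty list would reach TfidfTransformer with 0 samples and raise;
        # the route can return a 400 JSON error here instead.
        return None
    return vectorizer.transform(selected_preferences)

print(transform_preferences([]))               # None instead of a ValueError
print(transform_preferences(["beach"]).shape)  # (1, 7)
```

In the route itself, the `None` case would map to something like `return jsonify({'error': 'no preferences selected'}), 400` before any similarity computation runs.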
