- Hardcover: 738 pages
- Publisher: Springer; Edition: 1st ed. 2006. Corr. 2nd printing 2011 (February 15, 2010)
- Series: Information Science and Statistics
- Language: English
- ISBN-10: 0387310738
- ISBN-13: 978-0387310732
- Average customer rating: 2 customer reviews
Amazon Bestsellers Rank:
- #1,609 in Foreign Language Books
- #7 in Foreign Language Books > Science, Technology & Medicine > Mathematics > Statistics & Probability
- #8 in Foreign Language Books > Computing, Internet & Digital Media > Digital Media & Graphic Design
- #28 in Foreign Language Books > Computing, Internet & Digital Media > Computer Science
Pattern Recognition and Machine Learning (Information Science and Statistics) (English) Hardcover – February 15, 2010
Product description
From the reviews:
"This beautifully produced book is intended for advanced undergraduates, PhD students, and researchers and practitioners, primarily in the machine learning or allied areas...A strong feature is the use of geometric illustration and intuition...This is an impressive and interesting book that might form the basis of several advanced statistics courses. It would be a good choice for a reading group." John Maindonald for the Journal of Statistical Software
"In this book, aimed at senior undergraduates or beginning graduate students, Bishop provides an authoritative presentation of many of the statistical techniques that have come to be considered part of ‘pattern recognition’ or ‘machine learning’. … This book will serve as an excellent reference. … With its coherent viewpoint, accurate and extensive coverage, and generally good explanations, Bishop’s book is a useful introduction … and a valuable reference for the principle techniques used in these fields." (Radford M. Neal, Technometrics, Vol. 49 (3), August, 2007)
"This book appears in the Information Science and Statistics Series commissioned by the publishers. … The book appears to have been designed for course teaching, but obviously contains material that readers interested in self-study can use. It is certainly structured for easy use. … For course teachers there is ample backing which includes some 400 exercises. … it does contain important material which can be easily followed without the reader being confined to a pre-determined course of study." (W. R. Howard, Kybernetes, Vol. 36 (2), 2007)
"Bishop (Microsoft Research, UK) has prepared a marvelous book that provides a comprehensive, 700-page introduction to the fields of pattern recognition and machine learning. Aimed at advanced undergraduates and first-year graduate students, as well as researchers and practitioners, the book assumes knowledge of multivariate calculus and linear algebra … . Summing Up: Highly recommended. Upper-division undergraduates through professionals." (C. Tappert, CHOICE, Vol. 44 (9), May, 2007)
"The book is structured into 14 main parts and 5 appendices. … The book is aimed at PhD students, researchers and practitioners. It is well-suited for courses on machine learning, statistics, computer science, signal processing, computer vision, data mining, and bio-informatics. Extensive support is provided for course instructors, including more than 400 exercises, lecture slides and a great deal of additional material available at the book’s web site … ." (Ingmar Randvee, Zentralblatt MATH, Vol. 1107 (9), 2007)
"This new textbook by C. M. Bishop is a brilliant extension of his former book ‘Neural Networks for Pattern Recognition’. It is written for graduate students or scientists doing interdisciplinary work in related fields. … In summary, this textbook is an excellent introduction to classical pattern recognition and machine learning (in the sense of parameter estimation). A large number of very instructive illustrations adds to this value." (H. G. Feichtinger, Monatshefte für Mathematik, Vol. 151 (3), 2007)
"Author aims this text at advanced undergraduates, beginning graduate students, and researchers new to machine learning and pattern recognition. … Pattern Recognition and Machine Learning provides excellent intuitive descriptions and appropriate-level technical details on modern pattern recognition and machine learning. It can be used to teach a course or for self-study, as well as for a reference. … I strongly recommend it for the intended audience and note that Neal (2007) also has given this text a strong review to complement its strong sales record." (Thomas Burr, Journal of the American Statistical Association, Vol. 103 (482), June, 2008)
"This accessible monograph seeks to provide a comprehensive introduction to the fields of pattern recognition and machine learning. It presents a unified treatment of well-known statistical pattern recognition techniques. … The book can be used by advanced undergraduates and graduate students … . The illustrative examples and exercises proposed at the end of each chapter are welcome … . The book, which provides several new views, developments and results, is appropriate for both researchers and students who work in machine learning … ." (L. State, ACM Computing Reviews, October, 2008)
"Chris Bishop’s … technical exposition that is at once lucid and mathematically rigorous. … In more than 700 pages of clear, copiously illustrated text, he develops a common statistical framework that encompasses … machine learning. … it is a textbook, with a wide range of exercises, instructions to tutors on where to go for full solutions, and the color illustrations that have become obligatory in undergraduate texts. … its clarity and comprehensiveness will make it a favorite desktop companion for practicing data analysts." (H. Van Dyke Parunak, ACM Computing Reviews, Vol. 49 (3), March, 2008)
Publisher's description
This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions, at a time when no other textbook applied graphical models to machine learning. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential, as the book includes a self-contained introduction to basic probability theory.
Customer reviews
Overall, the difficulty of the book is very high. To follow it you have to lean on complementary books such as The Elements of Statistical Learning and The Matrix Cookbook (for the derivations).
Most helpful customer reviews on Amazon.com
In my opinion, despite the recent publication of Kevin Murphy's very comprehensive ML book, Bishop is still the better read. This is mostly because of his incredible clarity, but the book has other virtues: best-in-class diagrams, judiciously chosen; a lot of material, very well organized; excellent stage setting (the first two chapters). Now, sometimes he's a bit cryptic; for example, the proof that various kinds of loss lead to the conditional median or mode is left as an exercise (ex. 1.27). Murphy actually discusses it in some detail. This is true in general: Murphy discusses many things that Bishop leaves to the reader. I thought chapters three and four could have been more detailed, but I really have no other complaints.
Please note that to get the most out of this book you should already have a little background in linear algebra, probability, and calculus, and preferably some statistics. The first time I approached it, I had no such background, and I found it a bit unfriendly and difficult; this is no fault of the book, however. Still, you don't need that much, just the basics.
Update: I should note that there are some puzzling omissions from this book. E.g., the F-score and confusion matrices are not mentioned (see Murphy, section 5.7.2); it would have been very natural to introduce these concepts in ch. 1, along with decision theory. Nor is there much on clustering, except for K-means (see Murphy, ch. 25). Not a huge deal: it's easy to get these concepts from elsewhere. I recommend using Murphy as and when you need it, to fill in the gaps.
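For readers who want those omitted concepts right away, a binary confusion matrix and the F1 score are easy to compute by hand. A minimal sketch (my own, not from either book), using made-up labels:

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn

def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_matrix(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [1, 1, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1]
print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 2)
print(f1_score(y_true, y_pred))          # 0.75
```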
One more update: I've been getting into Hastie et al.'s ESL recently, and I'm really impressed with it so far. I think the practitioner should probably get familiar with both ESL and PRML, as they have complementary strengths and weaknesses. ESL is not very Bayesian at all; PRML is relentlessly so. ESL does not use graphical models or latent variables as a unifying perspective; PRML does. ESL is better on frequentist model selection, including cross-validation (ch. 7). I think PRML is better for graphical models, Bayesian methods, and latent variables (which correspond to chs. 8-13), and ESL better on linear models and density-based methods (and other stuff besides). Finally, ESL is way better on "local" models, like kernel regression and loess. Your mileage may vary... They are both excellent books. ESL seems a bit more mathematically dense than PRML, and is also better for people who are in industry as opposed to academia (I was in the latter but am now in the former).
- not mathematically heavy; lots of good heuristics that capture the math without delving too far in
- choice of topics and their discussion (e.g. a great place to learn about kernel methods and graphical models)
- easy to get hooked on if you mind the gaps
- read below...
While the exposition is spotty (compare, e.g., with Feller or Gelman), the author manages to sustain a mostly linear exposition of fascinating topics.
The book would benefit greatly from editing by someone with a solid math background. To be fair, there are more good (instructive) mistakes than bad ones... Often, when speaking with people with more of a stats background than me, the conversation is isomorphic to:
Me: "...therefore, this statement is wrong. I think what he meant was..."
Bro: "Ah yes. But you get it, that's what he meant"
Me: "Then why didn't he write it?"
But at least dialogues like these help cement ideas...
Please correct me if any of the following contentions are wrong (I may update as I continue to read):
Some parts are not even wrong, for example:
Sec 2.1, paragraph above Eq. 2.19
"We see that this sequential approach to learning arises naturally when we adopt a Bayesian viewpoint. It is independent of the choice of prior and of the likelihood functions and depends only on the assumption of i.i.d data"
First, if you follow the thread of this section and go back to the contrived coin-flipping example, you will see that in the non-Bayesian point of view estimates are also updated over a sequence of experiments. Hence, a Bayesian point of view is in this case no more "natural" than a frequentist one. Second, by definition of i.i.d., a single fixed distribution is postulated to exist, and therefore a prior is in fact chosen: how do you define a posterior without a prior? But OK, I think I get it: a sequential approach fits in nicely with the Bayesian point of view. I agree, and that's all that needs to be said.
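The agreed-upon point, that sequential updating fits in nicely with the Bayesian view, is easy to demonstrate for the Beta-Bernoulli model of this section. A minimal sketch (mine, not the book's): updating a Beta(a, b) prior one observation at a time gives exactly the same posterior as a single batch update.

```python
def sequential_posterior(a, b, data):
    """Update Beta(a, b) one Bernoulli observation x in {0, 1} at a time."""
    for x in data:
        a, b = a + x, b + 1 - x
    return a, b

def batch_posterior(a, b, data):
    """Update Beta(a, b) in one shot: a += heads, b += tails."""
    m = sum(data)          # number of heads
    l = len(data) - m      # number of tails
    return a + m, b + l

data = [1, 0, 1, 1, 0, 1]
assert sequential_posterior(2, 2, data) == batch_posterior(2, 2, data)
print(sequential_posterior(2, 2, data))  # (6, 4)
```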
Same section, following 2.3.5, the statement(s) following Eq. (2.20)
"Note that in the limit of an infinitely large data set m, l -> infinity the result (2.20) reduces to the maximum likelihood (2.8)."
First, if F is a function of x, then the limit of F as x goes to infinity must not involve x. Plus, the order and/or direction of his m and l in the limit is ambiguous. Second, what he meant to say is that for m and l both sufficiently large compared with a and b, (2.20) reduces to (2.8).
3rd paragraph before 2.2:
"For a finite data set, the posterior mean for mu always lies between the prior mean and the maximum likelihood estimate for mu corresponding to the relative frequencies of events given by (2.7)."
Again we are told to forget that the choice of a prior makes a difference. The statement above seems false: we may choose a prior that is heavily weighted on a single point, so that the prior's mean is greater than the maximum likelihood estimate.
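For what it's worth, with the conjugate Beta prior this section actually uses, the statement does hold: the posterior mean (a + m) / (a + m + b + l) is a convex combination of the prior mean a / (a + b) and the MLE m / (m + l), so any counterexample seems to require leaving the Beta family. A quick check (hypothetical numbers, my own):

```python
def between(x, lo, hi):
    """True if x lies in the closed interval spanned by lo and hi (either order)."""
    lo, hi = min(lo, hi), max(lo, hi)
    return lo <= x <= hi

# (a, b) = prior hyperparameters; m heads and l tails observed.
for a, b, m, l in [(2, 2, 9, 1), (1, 5, 3, 3), (10, 1, 0, 8)]:
    prior_mean = a / (a + b)
    mle = m / (m + l)
    post_mean = (a + m) / (a + m + b + l)
    assert between(post_mean, prior_mean, mle), (a, b, m, l)
print("posterior mean lies between prior mean and MLE in all cases")
```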
Paragraphs directly above the beginning of 2.2:
"In fact, we might wonder whether it is a general property of Bayesian learning that, as we observe more and more data, the uncertainty represented by the posterior distribution will steadily decrease..."
" this result shows that, on average, the posterior variance of theta is smaller than the prior variance."
The "result", i.e. Eq (2.24) is an assertion of the form: "Suppose a,b, c > 0, c is fixed, and c = a +b. Then, if b goes up, a must go down." I don't see how this relates to what seemed to be his premise that increasing the size of a data set (sequentially or not) has the seemingly desired effect of reducing posterior variance. I suspect there are in fact limiting results in special cases that show the desired "steady" reduction in posterior variance...I wish he would have referenced them
Section 2.3, following Eq. 2.44
"... we note that the matrix \Sigma can be taken to be symmetric..."
Actually, by definition any covariance matrix *is* symmetric.
I could go on...
All this said, it's worth repeating:
I like the book, and not only because its mistakes or occasionally shady logic encourage the interested reader to try to discover correct (or less wrong) statements.